[FFmpeg-devel] [PATCH v2] Add SRV3 decoder/demuxer
fishhh
fishhh at fishhh.dev
Fri Dec 27 01:16:04 EET 2024
> instructions on how to obtain a srv3 sample file would be helpful too.
> this is so we can actually test your patch.
downloading the file from YouTube is probably the easiest way,
you can achieve this by the means of yt-dlp like this:
yt-dlp --skip-download --write-subs --sub-langs en --sub-format srv3
<URL>
--skip-download makes it so the video itself is not downloaded
if you need a video with complex subtitles for testing, this
one exists: https://www.youtube.com/watch?v=gxWAM5mgH_o
also I just realized adding `ctx->q.keep_duplicates = 1;`
somewhere in srv3_read_header significantly improves the quality
of some conversions because it keeps the queue from removing
events with the same content but different styles
this also prompted me to make EDGE_GLOW emit a smaller glow
(`{\\bord1\\blur1}`) which looks way better
you can test with those changes if you care about the result's quality
> usually we leave the version.h modifications out of patches posted to
> the
> mailining list, and only make those changes at git push time. this
> makes it
> easier to apply your patch in the future when those files may have
> changed.
> also for new codec we bump avcodec/version.h.
right, makes perfect sense, the documentation at
https://ffmpeg.org/developer.html#New-codecs-or-formats-checklist
should probably be updated to not tell you to change it yourself
> this patch sets adds a demuxer and a decoder. imho, it should be split
> into
> two patches. the decoder patch comes first, then the demuxer.
> ...
> allcodecs.c sections are sorted by name.
okay, I'll do those things for v3.
>> +enum SRV3PenAttrs {
>> + SRV3_PEN_ATTR_ITALIC = 1,
>> + SRV3_PEN_ATTR_BOLD = 2,
>> +};
>
> you use 0 elsewhere in the code.
> should there be a SRV3_PEN_ATTR_NONE too?
the difference between PenAttrs and the other enums is that
PenAttrs is a bitfield because the pen can be both italic and bold
of course.
now that I've looked into this it does seem ffmpeg uses macros
for defining flags so maybe I should just make these
#define SRV3_PEN_ATTR_ITALIC (1 << 0)
#define SRV3_PEN_ATTR_BOLD (1 << 1)
>> +typedef struct SRV3Pen {
>> + int id;
>> +
>> + int font_size, font_style;
>> + int attrs;
>> +
>> + int edge_type, edge_color;
>> +
>> + int ruby_part;
>
> these should be enum SRV3PenAttrs, enum SRV3EdgeType, enum
> SRV3RubyPart,
> instead of int. this helps static analysis. HOWEVER, as part of reading
> in the values you assume they are int and this approach has merrit too.
yeah I didn't do that specifically so I could just pass these as
int* to the parsing functions, it would be nice if we could just do
enum SRV3EdgeType : int {}
but that's a C23 thing so not really acceptable.
> = SRV3_EDGE_NONE
>> +#define FREE_LIST(type, list, until) \
>> +do { \
>> + for (void *current = list; current && current != until; current =
>> next) { \
>> + next = ((type*)current)->next; \
>> + av_free(current); \
>> + } \
>> +} while(0)
>
> syyle: the \'s should all line up
will fix
>> +
>> +static int srv3_probe(const AVProbeData *p)
>> +{
>> + if (strstr(p->buf, "<timedtext format=\"3\">"))
>> + return AVPROBE_SCORE_MAX;
>> +
>
> use strstr(p->buf, "<timedtext") && strstr(p->buf, "format=\"3\"");
>
> this handle the situaton wehere an extra tag is inserted between the
> two, e.g.
> <timedtext x="y" format="3" ....>
>
okay
>> +
>> +static int srv3_parse_numeric_attr(SRV3Context *ctx, const char
>> *parent, xmlAttrPtr attr, int *out, int min, int max)
>> +{
>> + return srv3_parse_numeric_value(ctx, parent, attr->name,
>> attr->children->content, 10, out, min, max) == 0;
>> +}
>> +
>> +static void srv3_parse_color_attr(SRV3Context *ctx, const char
>> *parent, xmlAttrPtr attr, int *out)
>> +{
>> + srv3_parse_numeric_value(ctx, parent, attr->name,
>> attr->children->content + (*attr->children->content == '#'), 16, out,
>> 0, 0xFFFFFF);
>> +}
>> +
>
> for both functions, should probably return the
> srv3_parse_numeric_value() error value
well actually the only reason srv3_parse_numeric_value returns an
error value is because attributes like "p" and "wp" have to
process the number if it is parsed successfully to find the
structure with the right id. I can make parse_color_attr return an error
but it'd be ignored in all cases.
> srv3_parse_numeric_value() returns an error, but you are discarding it
> everywhere here.
the error value from srv3_parse_numeric_value is discarded to allow
parsing slightly incorrect subtitles since a single event's invalid
opacity really shouldn't abort parsing and srv3_parse_numeric_value
already logs an error so its return value is only necessary in the
special cases I mentioned above.
> i would probably convert these attributes ("sz", "fs", "et", etc.), the
> pen struct offsets, and
> min/max values into an array, and then iterate through that array
> searching for the attribute name.
> something like AVOption.
nice idea I'll do that
>> +static int srv3_clean_segment_text(char *text) {
>> + char *out = text;
>> + const char *start = text;
>> +
>> + while (1) {
>> + const char *end = strstr(start, ZERO_WIDTH_SPACE);
>> + size_t cnt = end ? (size_t)(end - start) :
>> (size_t)strlen(start);
>
> strlen returns size_t, should be no need to cast it.
right thanks
> ditto for end-start;
I think pointer subtraction returns ptrdiff_t which is supposed
to be signed but it appears compilers don't complain about that
without -Wconversion (gcc doesn't even with -Wconversion which
I find amusing) since I don't see any mention of -Wconversion
in ffmpeg I can remove that cast too.
>> + } else if (!strcmp(attr->name, "ws")) {
>> + // TODO: Handle window styles
>
> use avpriv_report_missing_feature()?
>
I can do that but it'll trigger on many subtitle files as it's
pretty commonly used, just doesn't usually change all that much
which is why it's unimplemented.
>> + if(textlen == 0)
>> + continue;
>
> if (textlen...
I assume you mean if(!textlen) and aren't suggesting to put
everything into an if block?
>> +
>> +static const AVOption options[] = {
>> + { NULL }
>> +};
>> +
>> +static const AVClass srv3_demuxer_class = {
>> + .class_name = "SRV3 demuxer",
>> + .option = options,
>> + .version = LIBAVUTIL_VERSION_INT,
>> +};
>
> are you planning to use options?
> if not, options and srv3_demuxer_class can be removed.
when I wrote this code I did have in mind a hypothetical
option that would enable using non-standard libass functionality
like BorderStyle = 3 so that's a possibility but I probably won't
implement that myself anytime soon.
can a class be added retroactively? if yes then yeah this
should probably be removed.
More information about the ffmpeg-devel
mailing list