[FFmpeg-devel] [PATCH] SAMI demuxer and decoder.

Clément Bœsch ubitux at gmail.com
Tue Jun 12 20:23:52 CEST 2012

On Tue, Jun 12, 2012 at 01:54:25PM +0200, Stefano Sabatini wrote:
> > +#include "ass.h"
> > +#include "libavutil/avstring.h"
> > +#include "libavutil/bprint.h"
> > +
> > +typedef struct {
> > +    AVBPrint source;
> > +    AVBPrint content;
> > +    AVBPrint full;
> > +} SAMIContext;
> > +
> > +static int sami_to_ass(AVCodecContext *avctx, const char *src)
> maybe mention "sami_paragraph"

OK, renamed to sami_paragraph_to_ass()

> > +
> > +static int sami_probe(AVProbeData *p)
> > +{
> > +    const unsigned char *ptr = p->buf;
> > +
> > +    if (AV_RB24(ptr) == 0xEFBBBF)
> > +        ptr += 3;  /* skip UTF-8 BOM */
> > +    return !strncmp(ptr, "<SAMI>", 6) ? AVPROBE_SCORE_MAX : 0;
> > +}
> > +
> > +static int sami_read_header(AVFormatContext *s)
> > +{
> > +    SAMIContext *sami = s->priv_data;
> > +    AVStream *st = avformat_new_stream(s, NULL);
> > +
> > +    if (!st)
> > +        return -1;

Changed; also queued a commit to fix it in lavf/srtdec.c, from which I
stole this function, as well as in lavf/microdvddec.c (I'm too lazy to
check for more ATM).

> [...]
> Overall considerations: I see you're using ff_ass_to_subrect() to
> convert SAMI paragraph -> ASS subtitle -> subrect.
> The intermediary conversion SAMI -> ASS -> subrect though looks a bit
> awkward so I would like you to shortly comment on that (e.g. why not a
> direct conversion?).

I'm just following the original design; the subrect contains the ASS
events since that is the current "standard way" of storing subtitle
markup. Using that function is handy (it handles duration=-1, does the
subrect allocation, etc.). But maybe I don't understand what you really mean?
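
To illustrate the kind of work such a helper saves, here is a minimal
standalone sketch of turning a text paragraph plus timing into an ASS
event line. It is not the code of ff_ass_to_subrect(): the function names
(cs_to_string, make_ass_event), the simplified "Dialogue:" layout and the
"9:59:59.99" end placeholder used for duration=-1 are all assumptions made
for the example.

```c
#include <stdio.h>

/* Hypothetical helper: format a timestamp given in centiseconds as
 * H:MM:SS.CC (assumes a non-negative value). */
static void cs_to_string(char *buf, size_t size, int cs)
{
    snprintf(buf, size, "%d:%02d:%02d.%02d",
             cs / 360000, cs / 6000 % 60, cs / 100 % 60, cs % 100);
}

/* Hypothetical helper: build a simplified ASS event line.
 * A negative duration means "unknown"; a far-away end time is used as a
 * placeholder in that case (assumption for this sketch).
 * Returns what snprintf() returns. */
int make_ass_event(char *dst, size_t size, const char *text,
                   int start_cs, int duration_cs)
{
    char s_start[32], s_end[32];

    cs_to_string(s_start, sizeof(s_start), start_cs);
    if (duration_cs >= 0)
        cs_to_string(s_end, sizeof(s_end), start_cs + duration_cs);
    else
        snprintf(s_end, sizeof(s_end), "9:59:59.99");
    return snprintf(dst, size, "Dialogue: 0,%s,%s,Default,%s\r\n",
                    s_start, s_end, text);
}
```

Calling make_ass_event(buf, sizeof(buf), "Hello", 150, 200) would describe
an event starting at 1.5s and lasting 2s; passing -1 as the duration
exercises the unknown-duration path.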

> Also I currently don't know how you're dealing with styles (SAMI seems
> to use CSS for defining the style of the various elements), especially
> I don't know how FFmpeg keeps track of style in its internal
> representation (if at all). I suppose full styling support may require
> a full-featured CSS rendering engine but this would be overkill in
> this case, but would make sense to annotate the current limitations in
> a dedicated section in doc/decoders.texi.

Yes it would require a full CSS2 parser... Actually it needs an HTML
rendering engine to honor the tables and all this HTML crap
(--enable-webkit yepee~!). So right now I'm just stripping all the markup
(OTOH, the decoder does support the "speaker" feature), and that should be
enough for a start.
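
For reference, the stripping idea can be sketched in a few lines of
standalone C. This is not the decoder's code (which works on AVBPrint
buffers); strip_markup() is a hypothetical name, and handling only
"&nbsp;" among the entities is a simplification for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst, dropping everything between
 * '<' and '>' and replacing "&nbsp;" with a space. The output is always
 * NUL-terminated and truncated to dst_size - 1 bytes. An unterminated
 * tag swallows the rest of the input.
 * Returns the number of bytes written, not counting the terminator. */
size_t strip_markup(char *dst, size_t dst_size, const char *src)
{
    size_t n = 0;
    int in_tag = 0;

    if (!dst_size)
        return 0;
    for (; *src; src++) {
        if (in_tag) {
            if (*src == '>')
                in_tag = 0;
        } else if (*src == '<') {
            in_tag = 1;
        } else if (!strncmp(src, "&nbsp;", 6)) {
            if (n + 1 < dst_size)
                dst[n++] = ' ';
            src += 5; /* the loop increment skips the final ';' */
        } else if (n + 1 < dst_size) {
            dst[n++] = *src;
        }
    }
    dst[n] = '\0';
    return n;
}
```

Feeding it a paragraph such as "<P Class=ENCC><b>Hello</b>&nbsp;world"
leaves only the visible text.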

If you want to deal with the styles a little, you could just put the CSS
in the extradata on the demuxer side, and use it in the decoder.

About doc/decoders.texi: that file looks pretty poor; do you think it's
really worth devoting more documentation to a format no one should
really care about?

I'm just trying here to get half-decent support for most subtitle
formats so we can have an overview of all the features we will likely
need, and create a bit of interest in subtitle support in FFmpeg.
Given the current state, I don't think it's really important to expand on
the large amount of limitations we have at the moment :)

And looking at various other applications "supporting" SAMI, TBH, it's not
really better.

> As for what regards the parsing, I see that a bison/lex parsing tool
> may help but again this will require external dependencies and thus
> would be overkill just for this (same as an internal ff* parsing
> engine).

Actually, what would be useful is a basic XML parser; a lot of "big"
subtitle formats are SMIL-based (SAMI, RT, USF for instance). We could
add a dependency on a common (available anywhere), simple (we just need to
crawl the tags) and tolerant (have you actually looked at a SAMI file?)
parsing library (does anyone have one in mind?), or we could just add
~300 lines of code in FFmpeg for this.
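
To give an idea of what "crawl the tags" could mean, here is a minimal
sketch of a tolerant tag iterator in standalone C. It is only an
illustration, not a proposed API: the Tag struct and next_tag() are
hypothetical names, attributes are skipped rather than parsed, and
comments are not handled specially.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tag description; name points into the input buffer and
 * is NOT NUL-terminated. */
typedef struct {
    const char *name;
    size_t name_len;
    int closing;        /* 1 for </tag>, 0 for <tag ...> */
} Tag;

/* Find the next tag at or after p and fill *tag.
 * Returns the position right after the tag (after '>' or, if the '>' is
 * missing, the end of the string), or NULL when no tag is left.
 * A '<' not followed by a tag name (e.g. "a < b") is ignored. */
const char *next_tag(const char *p, Tag *tag)
{
    while ((p = strchr(p, '<'))) {
        const char *q = p + 1;

        tag->closing = (*q == '/');
        if (tag->closing)
            q++;
        tag->name = q;
        while (isalnum((unsigned char)*q))
            q++;
        tag->name_len = q - tag->name;
        if (tag->name_len) {
            const char *end = strchr(q, '>'); /* skip the attributes */
            return end ? end + 1 : q + strlen(q);
        }
        p++; /* stray '<': keep scanning */
    }
    return NULL;
}
```

A caller would loop with `while ((p = next_tag(p, &tag)))` and compare
tag.name case-insensitively against the elements it cares about.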

For now, I think this find & match method gives good enough results (and
AFAIK that kind of heuristic is fairly common in other media players), so
I'm just going with it. It can be changed later.
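
As a rough illustration of such a find & match heuristic (a sketch, not
the code from the patch), the following standalone C looks for the next
"<SYNC" marker case-insensitively and reads its Start value in
milliseconds. find_nocase() and next_sync() are hypothetical names, and
the code assumes a Start attribute follows each SYNC tag.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Case-insensitive substring search (ASCII only). */
static const char *find_nocase(const char *hay, const char *needle)
{
    size_t len = strlen(needle);

    for (; *hay; hay++) {
        size_t i = 0;
        while (i < len && tolower((unsigned char)hay[i]) ==
                          tolower((unsigned char)needle[i]))
            i++;
        if (i == len)
            return hay;
    }
    return NULL;
}

/* Hypothetical helper: locate the next "<SYNC ... Start=N" in buf and
 * store N in *start_ms. The value may be quoted or not.
 * Returns a pointer just past the number, or NULL if nothing matched. */
const char *next_sync(const char *buf, long *start_ms)
{
    const char *p = find_nocase(buf, "<SYNC");
    const char *q;
    char *end;

    if (!p)
        return NULL;
    q = find_nocase(p, "Start=");
    if (!q)
        return NULL;
    q += 6;
    if (*q == '"' || *q == '\'')
        q++;
    *start_ms = strtol(q, &end, 10);
    return end == q ? NULL : end;
}
```

Calling next_sync() repeatedly, each time from the pointer it returned,
walks through the events of a file without requiring the markup to be
well-formed.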

I'm going to start working on apparently simpler formats (such as AQT)
now. Once we have enough text subtitle formats (let's say 5-6 of
different kinds), we might consider doing some general thinking such as
"what do we do for all the formats with the last-to-next-event feature?",
"do we need an XML parser?", "do we need a common subtitles format
template?", "given all these markups and features, is it worth
implementing ASS++ in FFmpeg?", etc etc. But right now, I think it's too

Clément B.
