[FFmpeg-devel] [RFC] Talk about subtitles

Tue Nov 22 20:52:14 CET 2011

On Tue, Nov 22, 2011 at 07:56:41PM +0100, Clément Bœsch wrote:
>  - ASS/SSA: AFAIK we use internally a slightly modified representation of ASS to
>    store all the different subtitles tracks. While now ASS is used to render
>    almost everything, it might not be the perfect approach because it can't
>    store the information of all the subtitles format (such as pts precision).
>    See next points for various related headaches.

I am quite sure we use (or did use?) the MKV representation of ASS that
does not include the pts, but pts is instead stored separately.
However, we are limited to 1 ms precision there anyway.
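
To illustrate what "pts stored separately" means, here is a rough sketch
(Python only for brevity; the helper names are made up and are not FFmpeg
API). It assumes an ASS v4+ Dialogue line and the Matroska-style field order
ReadOrder, Layer, Style, Name, MarginL, MarginR, MarginV, Effect, Text, with
start time and duration carried outside the payload in milliseconds:

```python
# Illustrative only: split an ASS "Dialogue:" line into external timing
# (milliseconds) plus a Matroska-style payload without timestamps.
# Function names are hypothetical, not part of FFmpeg.

def ass_time_to_ms(ts):
    """Convert an ASS timestamp 'H:MM:SS.cc' (centiseconds) to milliseconds."""
    hours, minutes, seconds = ts.strip().split(":")
    whole, centis = seconds.split(".")
    total_s = (int(hours) * 60 + int(minutes)) * 60 + int(whole)
    return total_s * 1000 + int(centis) * 10


def dialogue_to_mkv(line, read_order):
    """Return (pts_ms, duration_ms, payload) for one ASS v4+ Dialogue line."""
    prefix = "Dialogue:"
    if not line.startswith(prefix):
        raise ValueError("not a Dialogue line")
    # Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
    # The Text field may contain commas, so split at most 9 times.
    fields = line[len(prefix):].strip().split(",", 9)
    if len(fields) != 10:
        raise ValueError("unexpected number of fields")
    layer, start, end = fields[0], fields[1], fields[2]
    pts_ms = ass_time_to_ms(start)
    duration_ms = ass_time_to_ms(end) - pts_ms
    # Timing is dropped from the payload; ReadOrder is prepended instead.
    payload = ",".join([str(read_order), layer.strip()] + fields[3:])
    return pts_ms, duration_ms, payload
```

Since ASS timestamps only have centisecond resolution, nothing is lost by the
1 ms container precision in this direction; the limit only matters for
formats with finer timing.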

>  - Jacosub: this is somewhat an ancestor of ASS/SSA, and it's also a crazy
>    format. If we do things properly, we might want to support it (so mplayer
>    could for instance use FFmpeg internals to replace its own old subtitles
>    support, which includes a partial jacosub support). Note that this format
>    supports the *include* of external *images*. This is also something we need
>    to care about in the internal struct. I already wrote a demuxer a while ago,
>    and it could be used.

To my knowledge the structs already allow combining text, ASS and image
rects, since each rect has a separate type.
You just need to be able to "normalize" them. E.g. all-bitmap for
rendering on video, if you want to be crazy all-text (even from bitmaps)
for some other uses...
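
A minimal sketch of what I mean by "normalize" (again Python for brevity; the
RectType enum only mimics the idea of a per-rect type and is not the real
FFmpeg enum, and the render and ocr callbacks are hypothetical placeholders
for something like a libass wrapper or an OCR engine):

```python
import re
from enum import Enum


class RectType(Enum):
    """Hypothetical stand-in for a per-rect subtitle type."""
    BITMAP = 1
    TEXT = 2
    ASS = 3


def ass_to_plain(ass_text):
    """Drop {...} override blocks and turn ASS line breaks into newlines."""
    plain = re.sub(r"\{[^}]*\}", "", ass_text)
    return plain.replace("\\N", "\n").replace("\\n", "\n")


def normalize(rects, target, render=None, ocr=None):
    """Convert a mixed list of (RectType, data) pairs to a single type.

    render(rtype, data) must return bitmap data; ocr(bitmap) must return text.
    Both are caller-supplied and only needed for the conversions that use them.
    """
    out = []
    for rtype, data in rects:
        if rtype == target:
            out.append((rtype, data))
        elif target == RectType.BITMAP:
            if render is None:
                raise ValueError("a renderer is needed for all-bitmap output")
            out.append((RectType.BITMAP, render(rtype, data)))
        elif target == RectType.TEXT:
            if rtype == RectType.ASS:
                out.append((RectType.TEXT, ass_to_plain(data)))
            elif ocr is not None:
                out.append((RectType.TEXT, ocr(data)))
            else:
                raise ValueError("OCR is needed to turn bitmaps into text")
        else:
            raise ValueError("unsupported target type")
    return out
```

The point is only that the conversion lives in one place and the consumer
(video overlay, text export, ...) picks the target type it can handle.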

>  - SAMI subtitles: this is HTML-only-valid-with-iexplorer5 crap, with CSS and
>    such. Converting them to ASS might be difficult but possible. Though, parsing
>    HTML is a pain: since it's not the only existing XML-like subtitles, we might
>    need to have subtitles with a dependency to a random HTML/XML library...

I don't really see the bother. We could add an extra type, but just
throw away (most of) the markup. Haven't seen many people complain about
it. Doesn't hurt to give people a reason to avoid inventing yet another
format in the case of subtitles.
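
As a rough sketch of "throw away most of the markup" (Python with its
standard-library HTML parser, purely to show how little is needed; the class
and function names are made up, and real SAMI files have more quirks than
this handles): keep the Start attribute of each SYNC tag, turn BR into a
newline, and ignore every other tag, class and CSS rule.

```python
from html.parser import HTMLParser


class SamiText(HTMLParser):
    """Collect [start_ms, text] cues from SAMI, ignoring almost all markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.cues = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "sync":
            start = dict(attrs).get("start")
            self.cues.append([int(start or 0), ""])
        elif tag == "br" and self.cues:
            self.cues[-1][1] += "\n"
        # Every other tag (P, FONT, I, ...) is silently dropped.

    def handle_data(self, data):
        # Text before the first SYNC (title, CSS) is not subtitle text.
        if self.cues:
            self.cues[-1][1] += data


def sami_to_cues(text):
    """Return a list of (start_ms, plain_text) tuples."""
    parser = SamiText()
    parser.feed(text)
    parser.close()
    return [(start, body.strip()) for start, body in parser.cues]
```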

>  - Closed caption and teletext: I don't know them enough, but we also need to
>    support them. If anyone can comment on this...

CC is mostly just plain text. I think it has some scrolling features and
stuff, but nobody I know really cares about it.
The biggest problem really is that nowadays it is often stored in MPEG
userdata, which _must_ be reordered along with the video frames before
it can be processed.
Haven't yet found a sane way of handling it in FFmpeg.
You can't really cram teletext fully into the subtitle framework since it is
(kind of) interactive, needs a page cache etc.
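
To make the CC reordering issue concrete, a small sketch (Python for brevity;
the function name and the depth parameter are my own, nothing here is FFmpeg
API). It assumes each chunk of userdata carries the pts of the frame it was
attached to and that frames are never displaced by more than depth positions
between decode and presentation order:

```python
import heapq


def reorder_cc(packets, depth):
    """Yield (pts, cc) pairs in presentation order.

    packets: iterable of (pts, cc) in decode order.
    depth:   assumed maximum reorder distance (e.g. number of B-frames
             that can sit between two reference frames).
    """
    heap = []
    for seq, (pts, cc) in enumerate(packets):
        # seq breaks ties so equal pts values keep their arrival order.
        heapq.heappush(heap, (pts, seq, cc))
        if len(heap) > depth:
            out_pts, _, out_cc = heapq.heappop(heap)
            yield out_pts, out_cc
    # Flush whatever is still buffered at end of stream.
    while heap:
        out_pts, _, out_cc = heapq.heappop(heap)
        yield out_pts, out_cc


# Decode order I P B B P B B; the caption bytes follow their frames.
decode_order = [(0, "He"), (3, "wo"), (1, "ll"), (2, "o "),
                (6, ".."), (4, "rl"), (5, "d!")]
print("".join(cc for _, cc in decode_order))                 # scrambled text
print("".join(cc for _, cc in reorder_cc(decode_order, 2)))  # Hello world!..
```

The sketch itself is trivial; the hard part is that the demuxer/parser level
does not know the presentation order, so the data has to travel with the
decoded frames somehow.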

> This is all I can think about ATM. So if we want to support this properly, I can
> only think of this: libavsubtitles (with a dependency to libass to render a lot
> of crazy markup in subtitles).
> 
> This library will be used to demux/decode/etc (for conversion purpose too) and
> render the subtitles. And more importantly, we would be able to use it to burn
> them through the libavfilter with something like vf_subtitles.

I don't see why you would need or even want to have demux/decode in a
separate lib instead of where it is now.
You will want something to convert between the different AVSubtitleRect
types, but whether that justifies a separate lib, especially if it's mostly
going to be a libass wrapper, seems questionable to me.

> There is ATM a pending patch (vf_ass) that can workaround the situation for most
> of the people: a lot of software allow the convert from srt to ass, so it won't
> be needed to duplicate the srt to ass convert code yet another time for the
> filter.

Why and how would SRT even get to the filter? To my knowledge,
AVSubtitleRect doesn't even support/allow SRT.