[FFmpeg-devel] [PATCH] attachments support in matroska demuxer

Wed Jan 2 05:49:28 CET 2008

On Wed, 2008-01-02 at 04:31 +0100, Michael Niedermayer wrote:
> On Wed, Jan 02, 2008 at 03:47:40AM +0200, Uoti Urpala wrote:
> > On Tue, 2008-01-01 at 13:27 +0100, Michael Niedermayer wrote:
> > > On Tue, Jan 01, 2008 at 01:17:57PM +0300, Evgeniy Stepanov wrote:

> Well, if the demuxer cannot know which streams need an attachment, then
> the Matroska file cannot be remuxed with a subset of the streams unless you
> also include all (likely) unneeded attachments.

AFAIK this is the case - you cannot know which attachments decoders
might be able to use based on container-level information only.

>  That would be a serious
> design flaw and hardly one upon which we should base our API.

API for doing what? Libavformat should provide access to the contents of
Matroska attachments. What other things should the same API be
generalized to do? Hypothetical attachments in other containers?
Restrict it to less than general attachment handling for subtitle fonts
only, but generalize that to hypothetical fonts in other containers?

> Also, speaking of redundancy between streams, fonts are not special here:
> extradata will often be identical or similar between streams of the same
> codec.

Extradata may happen to be identical, but subtitles can refer to "the
same font" in a stricter sense. If you demux a .mkv file into separate
files, a natural format is to have an audio file (or files), a video
file, a .ssa/.ass file (or files) and a bunch of font files. Nobody
splits video/audio tracks into separate "standard extradata" and "main
content" parts.

> > > > Should I add a fake stream with
> > > > attachments as extradata?
> > > 
> > > this does not sound reasonable ...
> > 
> > I think this would be the least bad fit if you want to force the extra
> > resources into existing libavformat data structures.
> 
> It's not so much existing libavformat data structures as existing container
> formats. I can add a new data structure in libavformat, but how is the avi,
> asf, mov, ... file generated from it going to work?

You may not be able to automatically remux one from mkv, but AFAIK there
is currently no spec which would give corresponding functionality in
those containers with any method (independently of whether you want to
create the file by remuxing from mkv or from scratch). Attaching the
fonts during muxing or not is also a separate decision which is not
necessarily related to the contents of the subtitle track. And you
should also consider muxing mkv.

> > > The only way to see fonts as special relative to global data (=extradata)
> > > would be if one argues that a subtitle decoder outputs UTF-8 + position
> > > + font name + color + size + ...
> > 
> > You can't make an exhaustive list of properties easily. Position cannot
> > be expressed with simple coordinates for some fixed part of rendered
> > text unless the decoder knows the exact rendered size in advance. There
> > can be rotated text, different kinds of border, and various special
> > effects (more of which can appear in the future).
> 
> The data structure isn't cast in stone and can be extended as new things
> appear in the future.
> Any difficulty in making an exhaustive list already exists for writing a
> (needed) subtitle decoder/renderer. If you can't do that, you can't decode
> the subtitle either way.

The output format of such a "decoder" would need to be so high level
that it wouldn't so much be decoding as translating to another very
general subtitle format.

> > > but most important is that we need generic solutions, not something which
> > > can only be used with mkv-ass (= no hacks)
> > 
> > No other container can properly handle SSA/ASS with fonts AFAIK, so this
> > requirement is rather vague. In practice no way of handling fonts (or
> > other extra resource files) can be used with other containers before
> > there's a spec on how to store such data in those containers. I think

> Simply duplicating the fonts in extradata would work (with respect to fonts)
> with most containers and a libavcodec-based ASS/SSA decoder (so it would also
> work in all libavcodec-based players, even ones not using libavformat)

Duplicating the fonts can be expensive if there are lots of subtitle
tracks (which there can be as they're otherwise small). In addition to
the memory directly used by the font files, there are also whatever data
structures parsing and rendering them require - duplicating the fonts
in the extradata of each track would force a decoder to do extra work
comparing them to go back to the shared representation if it wants to
avoid that overhead. And what about other Matroska attachments?
Shouldn't there be a general way to access them? I think it would be
better to create an API for accessing Matroska attachments (generalized
to attachments in other containers if some get such functionality), and
leave issues specific to fonts and their relation to subtitle tracks
outside libavformat for now.
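
To make the proposal a bit more concrete, here is a rough sketch in C of
what container-level attachment access could look like, together with the
memory cost of the per-track duplication alternative. All names
(DemoAttachment, DemoContainer, demo_find_attachment, and so on) are
hypothetical and invented for illustration; none of this is existing
libavformat API. The fields mirror the Matroska attachment elements
FileName, FileMimeType and FileData.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical types for illustration only; not libavformat API. */
typedef struct DemoAttachment {
    const char *filename;       /* Matroska FileName */
    const char *mime_type;      /* Matroska FileMimeType */
    const unsigned char *data;  /* Matroska FileData */
    size_t size;                /* length of data in bytes */
} DemoAttachment;

/* Attachments live once at container level, not inside any stream. */
typedef struct DemoContainer {
    const DemoAttachment *attachments;
    size_t nb_attachments;
} DemoContainer;

/* Return the first attachment with the given filename, or NULL if none. */
static const DemoAttachment *demo_find_attachment(const DemoContainer *c,
                                                  const char *filename)
{
    for (size_t i = 0; i < c->nb_attachments; i++)
        if (strcmp(c->attachments[i].filename, filename) == 0)
            return &c->attachments[i];
    return NULL;
}

/* Bytes held when every attachment is stored once and shared. */
static size_t demo_shared_bytes(const DemoContainer *c)
{
    size_t total = 0;
    for (size_t i = 0; i < c->nb_attachments; i++)
        total += c->attachments[i].size;
    return total;
}

/* Bytes held if all attachments were copied into the extradata of
 * each of nb_subtitle_tracks subtitle tracks instead. */
static size_t demo_duplicated_bytes(const DemoContainer *c,
                                    size_t nb_subtitle_tracks)
{
    return demo_shared_bytes(c) * nb_subtitle_tracks;
}
```

With this shape the application, not libavformat, decides which attachments
a subtitle renderer gets, and the raw font data exists only once no matter
how many subtitle tracks the file has.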