[FFmpeg-devel] [PATCH] metadata conversion API

Sun Mar 1 01:06:55 CET 2009

On 2/28/2009 3:30 PM, Michael Niedermayer wrote:
> On Sat, Feb 28, 2009 at 02:24:45PM -0800, Baptiste Coudurier wrote:
>> On 2/28/2009 1:43 PM, Michael Niedermayer wrote:
>>> On Sat, Feb 28, 2009 at 01:24:41PM -0800, Baptiste Coudurier wrote:
>>>> On 2/28/2009 5:57 AM, Michael Niedermayer wrote:
>>> [...]
>>>>>>> * more compatibility for apps; apps can already set and get by
>>>>>>> name and enumerate fields through AVOptions.
>>>>>> AVOptions uses OPT_<type>, doesn't it? Why don't you want to
>>>>>> apply this to AVMetadata?
>>>>> i explained it already above:
>>>>>>> [...] This has the advantage that it can be muxed in
>>>>>>> containers that do not support storing such information.
>>>>> [...]
>>>>>>> Or how would you store these types? If they are lost on
>>>>>>> remuxing or their types are randomized then they aren't
>>>>>>> particularly useful IMHO
>>>> Well, they are useful for gathering information, printing metadata
>>>> and debugging; maybe less useful for remuxing between containers.
>>>> However, mov to mov could end up pretty accurate.
>>>>
>>>> Exporting all information using AVFormatContext fields will lead to
>>>> a huge struct.
>>> Exporting all fields as name-value pairs will consume more memory.
>>> What I mean is: an int needs 4 bytes, while an av_malloc() will need
>>> at least 16 bytes due to alignment alone, and I would not be surprised
>>> if it needs twice that to keep track of things so that malloc and free
>>> work. Now each metadata tag contains 2 strings. If we assume each fits
>>> in 16 bytes and no additional bytes are needed to keep track of things,
>>> then there are 16 bytes for the struct (2 8-byte pointers on 64-bit
>>> archs) + 16*2 bytes for the 2 strings. That's 48 bytes in the very
>>> best case; in reality it will need more.
>>>
>>> That makes it 4 bytes for a field in a struct (which could be
>>> reduced to 1 byte for many things) versus 48 bytes for a name-value
>>> tag.
>>>
>>> That means you need 12 times more unused fields than used fields in
>>> the struct just to make the name-value tags need the same amount of
>>> memory, and for 1-byte fields the factor is 48 instead of 12.
>>>
>>> Maybe this explains why I dislike them so much:
>>> * slow to access
>>> * memory hungry
>>>
>>> Also there is union{}, and one can place structs in structs to make
>>> the source clearer than a monolithic struct.
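
The memory estimate quoted above (4 bytes per int field versus about 48
bytes per name-value tag) can be reproduced with a small sketch. This is
only an illustration of the arithmetic, not FFmpeg code: the Tag struct,
the 16-byte allocation granule and all function names are assumptions
taken from the figures in the mail, and allocator bookkeeping is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed allocator granularity: the mail estimates that each allocation
 * costs at least 16 bytes because of alignment. */
#define ALLOC_GRANULE 16

/* Hypothetical name-value tag: two pointers to heap-allocated strings. */
typedef struct Tag {
    char *key;
    char *value;
} Tag;

/* Round a request up to the assumed allocation granularity. */
size_t alloc_cost(size_t bytes)
{
    return (bytes + ALLOC_GRANULE - 1) / ALLOC_GRANULE * ALLOC_GRANULE;
}

/* Best-case bytes for one tag: the pointer pair plus both strings
 * (including their terminating NUL), with no allocator bookkeeping. */
size_t tag_cost(const char *key, const char *value)
{
    return sizeof(Tag)
         + alloc_cost(strlen(key) + 1)
         + alloc_cost(strlen(value) + 1);
}

/* How many unused struct fields of field_size bytes cost as much as
 * one tag of tag_bytes bytes. */
size_t break_even_fields(size_t tag_bytes, size_t field_size)
{
    assert(field_size > 0);
    return tag_bytes / field_size;
}
```

With 8-byte pointers, tag_cost() for two short strings gives 16 + 16 + 16
= 48 bytes, and break_even_fields(48, 4) and break_even_fields(48, 1)
give the factors 12 and 48 mentioned in the mail.
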
>> I believe the code to print everything would be a pain in the ass.
>> You would have to iterate over all fields. I think easy access is
>> worth considering.
>>
> 
>> It's like the code handling the lang mechanism in the new metadata
>> API: it is _ugly_, and the implementation proves it.
> 
> huh?
> If I compare the current metadata API with what was originally proposed,
> then the original with a separate lang field was much uglier and much
> bigger LOC-wise.

That is your opinion. When I see the code splitting the lang from the tag
name in the muxer, and the code appending the lang to the tag, I find it
ugly.

>> Also I believe this would simplify adding support for libx264
>> command-line switches in the libx264 wrapper, since you would not need
>> an API extension in AVCodecContext for it, thanks to x264_param_parse,
>> which takes exactly 2 char *. This is what I call a generic API.
> 
> The question is whether we _WANT_ to let an encoder bypass the normal
> way to pass parameters.

I do. Is anybody against this?

> If the answer to this is yes, it's a matter of a single char *generic_param
> in AVCodecContext with which encoders can do what they want.

Each encoder will document the parameters it supports with doxygen when
declaring them. For libx264 we would point to the libx264 documentation,
of course.

Btw, it would be good to know which options an encoder honors; currently
users do not know this. This is not practical IMHO.

> If the answer OTOH is no, then the name-value tags won't help you; you
> would still need a common documented set and do translation on some ...
> 
> Also the first step is to ask the question on ffmpeg-dev if we _WANT_ to
> let an encoder bypass the normal way to pass parameters.
>
> The second step is to actually have someone work on the libx264 wrapper.
> I mean, you claim that the name-value lists would help code that is crap
> because no one works on it, not because anyone rejected some
> patch or because there is a difficult problem.

I claim that having "tag"/"value" pairs would extend functionality
and permit users to use libx264 options without extending the API, and
would need far less maintenance.

It is one-time work. This would greatly improve compatibility between the
x264 cli and the ffmpeg cli, and would remove fields only used for libx264.

We could use AVOptions for this but their names would need to be the
same as libx264 names, and we would need to translate AVOptions to char
* to pass to x264_param_parse.
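
To make the idea concrete, here is a minimal sketch of forwarding
"tag"/"value" pairs to a parser that takes exactly two strings, which is
the shape x264_param_parse has. Everything in it is hypothetical:
FakeParams and fake_param_parse are stand-ins so the example builds
without libx264, the option names are only illustrative, and the return
codes are assumptions rather than libx264's actual ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical tag/value pair as a generic option list might carry it. */
typedef struct Pair {
    const char *name;
    const char *value;
} Pair;

/* Stand-in for the encoder's parameter struct; this is NOT x264_param_t. */
typedef struct FakeParams {
    int bframes;
    int ref;
} FakeParams;

/* Stand-in parser with the same shape as x264_param_parse(): it takes a
 * name string and a value string. Returns 0 on success and -1 for an
 * unknown name (return codes are assumptions of this sketch). */
int fake_param_parse(FakeParams *p, const char *name, const char *value)
{
    if (strcmp(name, "bframes") == 0) {
        p->bframes = (int)strtol(value, NULL, 10);
        return 0;
    }
    if (strcmp(name, "ref") == 0) {
        p->ref = (int)strtol(value, NULL, 10);
        return 0;
    }
    return -1;
}

/* Forward every pair to the parser unchanged; no per-option field or
 * translation table is needed. Returns the number of rejected pairs. */
int apply_pairs(FakeParams *p, const Pair *pairs, size_t count)
{
    int rejected = 0;

    assert(p != NULL && pairs != NULL);
    for (size_t i = 0; i < count; i++)
        if (fake_param_parse(p, pairs[i].name, pairs[i].value) < 0)
            rejected++;
    return rejected;
}
```

The wrapper never needs to know the option names; a new encoder option
becomes usable without touching the wrapper or the public API, at the
cost of the names being whatever the encoder defines.
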

I know for sure that some people find the mapping of FFmpeg options to
libx264 messy.

This would also apply to libmp3lame; many people are asking for API
extensions to support the latest options.

I believe this would also permit muxers to get parameters which
are really only specific to them, like mxf, mp4, mkv.

>> On the other hand, you could explicitly instruct the demuxer not to
>> populate metadata; this would save more bytes than your solution, since
>> the context would be smaller.
> 
> That assumes that the fields are all useless for the user's task; this is
> extremely unlikely.
> If it's about true metadata like Author, that already exists in a
> name-value list and could, if someone actually cared, be disabled.

Needed values will always be exported, obviously, because they are needed.

> The other fields that we are talking about (everything that currently
> exists as a field in AVCodecContext) aren't so useless that you would
> achieve the 12-48 times more unused than used factor IMHO.

The scenario for encoding and muxing is a lot simpler IMHO.

- smaller AVCodecContext and AVFormatContext.
- Generic API which can be extended more easily. (code in muxer/encoder and
API bump)

Also, why not use fields for "metadata" (author, genre, track, comment
...) instead of "tag" strings, if you claim that this would be more
efficient speed- and memory-wise?

If we use "tag" strings there must be a reason, and I'd like to know why
you think this isn't suitable for anything other than author, comment, etc.

-- 
Baptiste COUDURIER                              GnuPG Key Id: 0x5C1ABAAA
Key fingerprint                 8D77134D20CC9220201FC5DB0AC9325C5C1ABAAA
checking for life_signs in -lkenny... no
FFmpeg maintainer                                  http://www.ffmpeg.org