[FFmpeg-devel] [PATCH] h264/aac in flv

Fri May 23 05:12:27 CEST 2008

Michael Niedermayer wrote:
> On Wed, May 07, 2008 at 04:39:05PM +0200, Baptiste Coudurier wrote:
>> Michael Niedermayer wrote:
>>> On Wed, May 07, 2008 at 03:38:02PM +0200, Baptiste Coudurier wrote:
>>>> Michael Niedermayer wrote:
>>>>> On Wed, May 07, 2008 at 12:36:42PM +0200, Baptiste Coudurier wrote:
>>>>>> Hi Michael,
>>>>>>
>>>>>> Michael Niedermayer wrote:
>>>>>>> On Tue, May 06, 2008 at 07:31:53PM +0200, Baptiste Coudurier wrote:
>>>>>>>> Michael Niedermayer wrote:
>>>>>>>>>>>>>> [...]
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>  static int get_audio_flags(AVCodecContext *enc){
>>>>>>>>>>>>>>      int flags = (enc->bits_per_sample == 16) ? FLV_SAMPLESSIZE_16BIT : FLV_SAMPLESSIZE_8BIT;
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>> +    if (enc->codec_id == CODEC_ID_AAC) // specs force these parameters
>>>>>>>>>>>>>> +        return FLV_CODECID_AAC | FLV_SAMPLERATE_44100HZ | FLV_SAMPLESSIZE_16BIT | FLV_STEREO;
>>>>>>>>>>>>> Is this also correct for AAC streams for which these arent true? Or are
>>>>>>>>>>>>> such streams just not supported?
>>>>>>>>>>>>>
>>>>>>>>>>>> Streams are supported (like mp3 48khz btw), and play well. Like written,
>>>>>>>>>>>> specs mandates these values.
>>>>>>>>>>> I know but what about lets say 48khz AAC or 22khz AAC your code would mux this
>>>>>>>>>>> with a claimed samplerate of 44khz.
>>>>>>>>>>> Is such 22khz AAC and 48khz AAC legal in flv accoridng to spec or is just
>>>>>>>>>>> 44khz AAC allowed?
>>>>>>>>>>> If later then the muxer should reject AAC with samplerates differing from 
>>>>>>>>>>> 44khz.
>>>>>>>>>>>
>>>>>>>>>> Rofl:
>>>>>>>>>> "Sampling rate
>>>>>>>>>> For AAC: always 3"
>>>>>>>>>>
>>>>>>>>>> Is this ok for you ?
>>>>>>>>> Ive not doubted that this has to be set to 3. The question is if
>>>>>>>>> 48khz/22khz AAC is allowed in flv or not. If one takes the spec literally
>>>>>>>>> then the awnser is clearly no.
>>>>>>>> Well, I would say it is probably yes, considering aac and
>>>>>>>> AudioSpecificConfig.
>>>>>>>>
>>>>>>>>> But your muxer would store them blindly
>>>>>>>>> with a claimed sample rate of 44khz.
>>>>>>>> Well, what can I say ? It's not clearly mentioned.
>>>>>>>> I fear they used the same crap as mp4, AudioSpecificConfig mentions the
>>>>>>>> correct sample rate and channels number.
>>>>>>>> This is the case for channels too:
>>>>>>>>
>>>>>>>> "SoundType UB[1]
>>>>>>>> 0 = sndMono
>>>>>>>> 1 = sndStereo
>>>>>>>> Mono or stereo sound
>>>>>>>> For Nellymoser: always 0
>>>>>>>> For AAC: always 1"
>>>>>>>>
>>>>>>>> Im in favor of muxing this way.
>>>>>>> Iam ok with that if the official software also generates such 22/48khz AAC.
>>>>>>>
>>>>>> Updated patches considering signed ctts offset.
>>>>>>
>>>>>> Now there is a problem with timestamps, also with mov (which use signed
>>>>>> offsets), because pts can be < dts.
>>>>> dts is the time at which a packet enters the decoder
>>>>> pts is the time at which the correspoding decoded packet leaves the decoder
>>>>>
>>>>> Thus dts <= pts
>>>>>
>>>>>
>>>>> Also quoting  ISO/IEC 14496-12 Second edition 2005-04-01 Corrected version 2005-10-01:
>>>>> ---
>>>>> This box provides the offset between decoding time and composition time. Since decoding time must be less
>>>>>                                                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>> than the composition time, the offsets are expressed as unsigned numbers such that CT(n) = DT(n) +
>>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>> CTTS(n) where CTTS(n) is the (uncompressed) table entry for sample n.
>>>> FYI, this will be soon obsoleted by an amendment,
>>> link?
>> Sent the draft I have on hands.
> 
> thanks
> I hope this draft is rejected, its insane.
> The problem is that the first dts in mov/mp4 is implicitly 0.
> Which doesnt work out because that just isnt always the case. Solution
> would be to specify this first dts.
> 
> What they do is change the pts-dts offsets to signed so the pts are correct
> while the dts are no longer correct. (they also add various optional fields
> to allow one to guess the correct dts but optional == useless)
> 
> The obvious implementation is to leave dts at AV_NOPTS_VALUE when there
> is a version 1 ctts. We could also attempt to guess the dts but this is
> not always possible. IP mpeg2 vs low delay IP mpeg2 being an example
> which is ambigous.
> 

Well, I'll just let current code (ignore negative ctts) for mov/mp4 as is.
It seems a full H264 header parser is becoming really needed to
correctly compute pts (it seems gpac is good for raw h264 streams, it
handles bpyramid quite nicely).

In the meantime we can still generate flv with positive dts by adding
delay.

Demuxer still handles cts as signed, this should cause no real harm,
except maybe some user complaining and I think Im ok with this, it will
remind me to think of a correct and good solution ;)

Updated patch attached.

-- 
Baptiste COUDURIER                              GnuPG Key Id: 0x5C1ABAAA
Smartjog USA Inc.                                http://www.smartjog.com
Key fingerprint                 8D77134D20CC9220201FC5DB0AC9325C5C1ABAAA
-------------- next part --------------
A non-text attachment was scrubbed...
Name: h264_flv.patch
Type: text/x-diff
Size: 9920 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20080522/d224b922/attachment.patch>