[FFmpeg-devel] AAC rtp_parse_packet RFC3640 compliance

Fri Apr 24 14:55:38 CEST 2009

Hi,

Jorge Pedroso wrote:
> I've come to know that rtpdec.c rtp_parse_packet isn't quite RFC3640 
> compliant when handling AAC packets. Namely, rtp_parse_mp4_au instead of 
> gathering each AU header it concatenates them into a big one and returns 
> only that AU header.
rtp_parse_mp4_au() works in a funny way, because it does not return
individual AAC frames... It simply returns a whole number of AAC frames
concatenated together, assuming that a "parser" (see
libavcodec/aac_ac3_parser.c) will split them into frames later.
This is not the simplest way to go, but I would not call it non-standard.
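
For reference, here is a rough sketch of what splitting the payload per AU
could look like. It is not the code in rtpdec.c. It assumes the RFC 3640
AAC-hbr mode (sizeLength=13, indexLength=indexDeltaLength=3), so each
AU-header is exactly 16 bits, and it ignores interleaving and fragmented
AUs. The names AuSpan and split_aac_hbr_payload are made up for this
example.

```c
#include <stddef.h>
#include <stdint.h>

/* One access unit (AU) located inside an RTP payload (hypothetical type). */
typedef struct {
    const uint8_t *data; /* points into the caller's payload buffer */
    size_t size;         /* AU size in bytes */
} AuSpan;

/*
 * Split an RFC 3640 AAC-hbr payload into its access units.
 * Assumption: sizeLength=13 and indexLength=indexDeltaLength=3, so every
 * AU-header is 16 bits. The AU-Index field is read past but not used.
 * Returns the number of AUs written to out[], or -1 on malformed input.
 */
int split_aac_hbr_payload(const uint8_t *buf, size_t len,
                          AuSpan *out, int max_aus)
{
    if (len < 2)
        return -1;

    /* AU-headers-length: 16-bit big-endian count of header *bits*. */
    unsigned hdr_bits = ((unsigned)buf[0] << 8) | buf[1];
    if (hdr_bits == 0 || hdr_bits % 16 != 0)
        return -1;

    int n = (int)(hdr_bits / 16);
    size_t hdr_bytes = 2 + (size_t)n * 2;
    if (n > max_aus || hdr_bytes > len)
        return -1;

    const uint8_t *au = buf + hdr_bytes;   /* start of the AU data section */
    size_t remaining = len - hdr_bytes;

    for (int i = 0; i < n; i++) {
        const uint8_t *h = buf + 2 + 2 * i;
        /* Upper 13 bits: AU size in bytes; lower 3 bits: AU-Index(-delta). */
        size_t au_size = ((size_t)h[0] << 5) | (h[1] >> 3);
        if (au_size > remaining)
            return -1;
        out[i].data = au;
        out[i].size = au_size;
        au += au_size;
        remaining -= au_size;
    }
    return n;
}
```

Each AuSpan then describes one AAC frame that could be handed out as its
own packet, instead of returning the whole AU data section at once.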


> rtp_parse_mp4_au says:
>> /* XXX: We handle multiple AU Section as only one (need to fix this 
>> for interleaving) In my test, the FAAD decoder does not behave 
>> correctly when sending each AU one by one but does when sending the 
>> whole as one big packet...*/
I do not know the details about this problem. My feeling is that there
was some unrelated bug in the RTP demuxer which was creating problems
to faad. This might be fixed now; you can try and report your findings.
Or... Maybe the problem is that rtp_parse_open() "forgets" to set
st->need_parsing = AVSTREAM_PARSE_FULL for the AAC case?
Try setting it, and the problem should be fixed.

> OK, this works for FAAD but breaks almost everything else, right?
No, because of the "parser" mentioned above: it should automagically take
care of splitting this large chunk of data into proper AAC frames.

> Moreover, av_read_frame documentation seems to me a bit misguiding when 
> saying:
>> If the audio frames have a variable size (e.g. MPEG audio), then it 
>> contains one frame.
> This is true when reading a 3GP file from disk but false when reading 
> from the RTP stream.
Did you try it? My feeling is that this should always be true. If it's
not, then you found a bug (the parser is not working, or it's not
correctly invoked... Try setting st->need_parsing as mentioned above).

> When reading from the RTP stream av_read_frame 
> returns the RTP payload minus the AU header(s).
It should not. rtp_parse_packet() returns N frames together, but
av_read_frame() should invoke the parser and return a single frame.

> Is it safe/realistic to 
> assume that a big AU Data Section trimmed from the RTP payload packet is 
> a valid AAC audio frame?
As far as I know, no.


				Luca