[FFmpeg-devel] [RFC] SVX8 stereo files and AVCODEC_MAX_AUDIO_FRAME_SIZE

Sat May 14 11:07:24 CEST 2011

On date Saturday 2011-05-14 01:57:41 +0200, Michael Niedermayer encoded:
> On Sat, May 14, 2011 at 01:30:03AM +0200, Stefano Sabatini wrote:
> > Hi,
> > 
> > I spent some time trying to figure out how to fix issue #169.
> > Currently the problem is that the code can't work for stereo files:
> > audio data in an SVX8 file is contained in a single chunk, with
> > all the left samples at the beginning and the right samples at the end.
> > 
> > The problem is that the chunk size is not known a priori, and its
> > decoded size could easily exceed the max buffer size.
> > 
> > A possible solution would be to extend the audio decoding/encoding API
> > (audio frame in AVFrame?), another would be to store the whole chunk
> > in the demuxer, decode+interleave and release audio packets with fixed
> > size, another would be to make the demuxer fail when stereo audio is
> > detected, since stereo can't work with the current code/framework.
> > 
> > decode+interleave in the demuxer seems simple enough, and would imply
> > dropping the 8svx decoders, which would become unnecessary since all the
> > decoding would be done in the demuxer itself.
> > 
> > Ideas/suggestions?

[Note: edited the above content to replace muxer with DEmuxer.]

> The decoder could just split and return several smaller frames
> I don't know if this is the best solution but it seems the easiest
> with the least hacks

Possible solution: the iff demuxer passes the whole audio chunk to the
decoder, which decodes it, interleaves it, and returns the first decoded
frame.

The application will then have to call avcodec_decode_audio3() again
with a NULL packet until the audio chunk is exhausted, at which point
the function will finally return 0 (and the total sum of consumed bytes
will be equal to the input packet size).
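
To make the calling convention above concrete, here is a minimal
stand-alone sketch. It does not use libavcodec: ChunkDecoder and
chunk_decode() are hypothetical names, the samples are assumed to be
raw signed 8-bit, and the decoder only keeps a pointer to the packet
data (a real decoder would have to copy it).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoder state; none of these names exist in libavcodec. */
typedef struct ChunkDecoder {
    const int8_t *chunk;   /* cached packet data, laid out LLLL...RRRR */
    size_t per_channel;    /* number of samples in each half */
    size_t pos;            /* next per-channel sample index to output */
} ChunkDecoder;

/* Mimics the proposed convention: pass the packet on the first call,
 * then data == NULL on the following calls. Interleaved samples are
 * written to out (capacity out_size bytes), the number of bytes written
 * is stored in *out_written, and the return value is the number of
 * input bytes consumed by this call; 0 means the chunk is exhausted. */
static int chunk_decode(ChunkDecoder *d, const int8_t *data, size_t size,
                        int8_t *out, size_t out_size, size_t *out_written)
{
    size_t n, i;

    assert(d && out && out_written);
    if (data) {                      /* new packet: cache it and restart */
        d->chunk       = data;
        d->per_channel = size / 2;
        d->pos         = 0;
    }
    *out_written = 0;
    if (!d->chunk)
        return 0;

    n = d->per_channel - d->pos;     /* samples left per channel */
    if (n > out_size / 2)
        n = out_size / 2;            /* limited by the output buffer */
    for (i = 0; i < n; i++) {
        out[2 * i]     = d->chunk[d->pos + i];                  /* left  */
        out[2 * i + 1] = d->chunk[d->per_channel + d->pos + i]; /* right */
    }
    d->pos      += n;
    *out_written = 2 * n;
    return (int)(2 * n);
}
```

A caller would pass the packet once, then loop with data set to NULL
while the return value is greater than 0, summing the return values;
at the end the sum matches the packet size (for an even-sized chunk).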

From the avcodec_decode_audio3() docs:

 * Decode the audio frame of size avpkt->size from avpkt->data into samples.
 * Some decoders may support multiple frames in a single AVPacket, such
 * decoders would then just decode the first frame. In this case,
 * avcodec_decode_audio3 has to be called again with an AVPacket that contains
 * the remaining data in order to decode the second frame etc.

The last sentence seems to imply that the *application* has to update
the packet (how?), while there is no mention that when the stream ends
the application may continue to call avcodec_decode_audio3() with a
NULL packet to get the cached frames.

Also, in this case the consumed data is not laid out sequentially in the
packet; indeed we have:
LLLLLLLLLLLLLLLLLLLL...RRRRRRRRRRRRRRRRRRRRRR

and we return LR for each frame, so it is not feasible to just advance
the pkt.data pointer.
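
As an illustration of why a single advancing pointer is not enough,
here is a minimal sketch of the interleaving step. The helper name
interleave_planar_stereo() is hypothetical (it is not a libavcodec
function) and the samples are assumed to be raw signed 8-bit. Each
output frame reads from two places in the chunk, so the state to keep
between calls is a per-channel sample index rather than a byte offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libavcodec.
 * src holds one stereo chunk laid out as LLLL...RRRR, with
 * samples_per_channel samples in each half. Starting at per-channel
 * index pos, write up to max_samples interleaved L/R pairs into dst
 * (which must have room for 2 * max_samples bytes) and return the
 * number of pairs written. */
static size_t interleave_planar_stereo(const int8_t *src,
                                       size_t samples_per_channel,
                                       size_t pos,
                                       int8_t *dst,
                                       size_t max_samples)
{
    const int8_t *left  = src;                       /* first half  */
    const int8_t *right = src + samples_per_channel; /* second half */
    size_t n, i;

    assert(pos <= samples_per_channel);
    n = samples_per_channel - pos;
    if (n > max_samples)
        n = max_samples;
    for (i = 0; i < n; i++) {
        dst[2 * i]     = left[pos + i];
        dst[2 * i + 1] = right[pos + i];
    }
    return n;
}
```

The caller advances pos by the returned count after each frame; a
return value of 0 with pos equal to samples_per_channel means the
chunk has been fully consumed.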
-- 
FFmpeg = Furious Foolish Multimedia Ponderous Excellent God