[FFmpeg-devel] Audio conversion and floating-point codecs

Sat Jul 10 10:51:06 CEST 2010

On Sat, Jul 10, 2010 at 10:37:54AM +0200, Axel Holzinger wrote:
> Hi,
> 
> > On Sat, Jul 10, 2010 at 08:23 PM, Peter Ross wrote: 
> > On Tue, Jul 06, 2010 at 03:13:26PM +0100, Måns Rullgård wrote:
> > > Peter Ross <pross at xvid.org> writes:
> > > 
> > > > On Sat, May 15, 2010 at 08:17:51PM +0100, Måns Rullgård wrote:
> > > >> There is a long-standing desire from some to make the
> > > >> floating-point decoders output float samples instead of converting
> > > >> to int16 internally, and I agree with the reasons for this.
> > > >> However, making this change hastily will make decoding orders of
> > > >> magnitude slower on many CPUs.  The reason is that when a decoder
> > > >> outputs float samples, the fast asm code for float-to-int
> > > >> conversion is not used.
> > > >>
> > > >> In order to change the output format of these decoders without
> > > >> impacting performance, we must first make a few improvements to the
> > > >> avcodec API and to the generic audio format conversion code.
> > > > [...]
> > > >
> > > >> - The decoders should output planar audio instead of interleaved for
> > > >>   multichannel streams.  This probably means introducing
> > > >>   avcodec_decode_audio4() with an AVFrame output.
> > > >
> > > > Q: does it make sense to expand the existing AVFrame structure, or
> > > > define a new struct specific to audio?
> > > >
> > > > #define FF_MAX_CHANNELS  8
> > > > struct AVAudioFrame {
> > > >     uint16_t *data[FF_MAX_CHANNELS];
> > > > };
> > > 
> > > I've posed the same question myself, without finding a good answer.
> > > Some codecs support a huge number of channels.  I can say for sure,
> > 
> > Second contentious point:
> > At present, the user allocates the samples buffer that is
> > handed off to avcodec_decode_audioN().
> >
> > IMHO this is sloppy. Just look at how ffmpeg.c guesses the
> > buffer size.
> > The alternative is to have the decoder do it, e.g. by calling
> > avctx->get_buffer() with the number of samples/channels to be output.
> > Thoughts?
> > 
> > > however, that uint16_t is the wrong data type to use here.
> > 
> > Oops. I intended int16_t.
> 
> Regarding the audio sample format, wouldn't it be nice to have an approach where the user (the one using libav...) can choose the native audio sample format from a supported list (i.e. uint8_t, int16_t, int32_t, float, ...) as the default sample format that all audio functions will then use? Like a C++ template that can be instantiated with uint8_t, int16_t, etc.
> 
> I know this is a lot of work, because it touches so many parts of the code. But when thinking about supporting more than just int16_t (which I think is a good idea and high time), all the possibilities should be on the table.

We already support this via sample_fmt.

(In the proposed implementation, frame->data[n] would be typecast to the
datatype used by codec->sample_fmt, e.g. 16-bit signed interleaved, 32-bit
float planar...)
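
To make the typecast idea concrete, here is a minimal sketch in C. It is
only an illustration and not actual libavcodec code: the struct, the enum
values, the channel limit and the helper sketch_get_sample() are all
hypothetical names invented for this example. The plane pointers are kept
untyped and are cast according to the frame's format tag when a sample is
read.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_CHANNELS 8            /* hypothetical limit */

/* Hypothetical format tags, NOT the libavcodec sample format enum. */
enum SketchSampleFmt {
    SKETCH_FMT_S16_INTERLEAVED,          /* all channels packed in data[0] */
    SKETCH_FMT_FLT_PLANAR                /* one plane per channel          */
};

/* Hypothetical audio frame: untyped plane pointers plus a format tag. */
struct SketchAudioFrame {
    uint8_t *data[SKETCH_MAX_CHANNELS];
    enum SketchSampleFmt fmt;
    int channels;
    int nb_samples;                      /* samples per channel */
};

/* Read sample i of channel ch, converted to float (int16 scaled by 1/32768). */
static float sketch_get_sample(const struct SketchAudioFrame *f, int ch, int i)
{
    assert(ch >= 0 && ch < f->channels && i >= 0 && i < f->nb_samples);

    switch (f->fmt) {
    case SKETCH_FMT_S16_INTERLEAVED: {
        /* Interleaved: cast data[0] and step over all channels. */
        const int16_t *s = (const int16_t *)f->data[0];
        return s[i * f->channels + ch] / 32768.0f;
    }
    case SKETCH_FMT_FLT_PLANAR: {
        /* Planar: cast the plane that belongs to this channel. */
        const float *s = (const float *)f->data[ch];
        return s[i];
    }
    }
    return 0.0f;                         /* unknown format tag */
}
```

The point of the sketch is that the caller never needs a separate struct
per sample type. The format tag alone decides how data[n] is interpreted,
and a planar format uses one data[] entry per channel while an interleaved
one uses only data[0].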

-- Peter
(A907 E02F A6E5 0CD2 34CD 20D2 6760 79C5 AC40 DD6B)