[FFmpeg-devel] support for audio with sample resolution better than 16bit

Sun Apr 20 12:13:19 CEST 2008

Hello Lars,

Lars Täuber wrote:
> Hello!
> 
> Recently I added a wish (#425) to the ffmpeg roundup system.
> I'd like ffmpeg to be able to transcode 24bit pcm into 24bit flac.
> 
> Therefore the audio handling has to be enhanced. Now I want
> to make some suggestions for how I think this could be done.
> 
> First, the decoding needs to be able to handle
> resolutions other than 16bit.
> I'd suggest adding the following function:
> 
> int attribute_align_arg avcodec_decode_audio3(AVCodecContext *avctx,
>                                               int16_t *samples,
>                                               int *frame_size_ptr,
>                                               void *low_samples,
>                                               int *low_frame_size_ptr,
>                                               uint8_t *sample_res,
>                                               const uint8_t *buf,
>                                               int buf_size)
> 
> When audio is decoded at 24 bit depth, the caller can say that he
> only wants to receive 16bit depth by setting *sample_res=16.
> Then low_frame_size_ptr and low_samples can be uninitialized.
> When 16 < *sample_res <= 24, low_samples is of type (uint8_t *)
> and the memory has to be allocated.
> When 24 < *sample_res <= 32, low_samples is of type (uint16_t *)
> and the memory has to be allocated.
> When 32 < *sample_res, return an error.
> 
> Audio samples are always put into the high bits, e.g. for a
> 26 bit sample:
> int32_t real_sample = samples[0]<<16 |
>                       (((uint16_t *) low_samples)[0] & 0xffc0)
> 
> After decoding, *sample_res is set to the actual sample
> resolution. E.g. if the caller says he wants 24bit maximum but
> the stream only had 20bit resolution, then *sample_res is set to 20.
> 
> When the caller sets *sample_res == 0 he accepts all depths up
> to 32bit.
> 
> Optionally the samples pointer could also be of type (void *)
> and then *sample_res could distinguish 16bit samples from
> 8bit samples.
> 
> What do you think?
> Lars
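
To make the split layout proposed above concrete, here is a minimal sketch of how a caller might rejoin the two buffers into one left-justified 32 bit sample. The helper name join_sample is hypothetical and not part of any FFmpeg API; it assumes the extra bits sit in the top of a 16 bit low word, as in the 26 bit example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not an FFmpeg function): rebuild one left-justified
 * 32 bit sample from the 16 high bits and a low word whose top
 * (sample_res - 16) bits carry the extra resolution. */
static int32_t join_sample(int16_t high, uint16_t low_part, unsigned sample_res)
{
    assert(sample_res > 16 && sample_res <= 32);

    unsigned extra = sample_res - 16;                      /* 1..16 extra bits */
    uint16_t mask  = (uint16_t)(0xFFFFu << (16 - extra));  /* 26 bit -> 0xffc0 */

    /* Work in unsigned arithmetic so no negative value is shifted left. */
    uint32_t joined = ((uint32_t)(uint16_t)high << 16) | (uint32_t)(low_part & mask);
    return (int32_t)joined;
}
```

For a 26 bit stream, join_sample(samples[i], low[i], 26) would keep the 10 extra bits and zero the remaining 6.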

Isn't that a bit too complicated?

There is hardly anything simpler than linear audio. A codec-like
routine that transforms a bunch of audio samples from resolution A into
resolution B is a very easy job (okay, if you want to do a
downconversion right, you need to add dither).
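
As a rough illustration of the dither remark, the following sketch reduces sign-extended 24 bit samples (stored in int32_t) to 16 bit with triangular (TPDF) dither. All function names are made up for this example, and rand() is only a stand-in noise source; a real converter would likely use a better generator.

```c
#include <stdint.h>
#include <stdlib.h>

/* Clamp a value to the int16_t range. */
static int16_t clip16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Floor division by 256 that avoids right-shifting negative values. */
static int32_t floor_div256(int32_t v)
{
    return v >= 0 ? v / 256 : -((-v + 255) / 256);
}

/* Hypothetical converter: 24 bit (sign-extended in int32_t) to 16 bit. */
static void s24_to_s16_dither(const int32_t *in, int16_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* Difference of two uniform values: triangular noise spanning
         * about +/-1 LSB of the 16 bit target (1 LSB = 256 here). */
        int32_t dither = (rand() & 0xFF) - (rand() & 0xFF);
        /* +128 rounds to nearest instead of always rounding down. */
        out[i] = clip16(floor_div256(in[i] + dither + 128));
    }
}
```

Without the dither term this degenerates to plain rounding, which is the "easy" conversion; the noise is what decorrelates the quantization error from the signal.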

If you want to add a completely new function avcodec_decode_audio3, why
not just use a simple signature like
avcodec_decode_audio3(AVCodecContext *avctx,
                      int16_t *samples,
                      int *frame_size_ptr,
                      uint8_t *sample_res,
                      const uint8_t *buf,
                      int buf_size)

Any module that needs a specific linear format can then use a
conversion routine that takes the 8/16/20/24/32 format (is 20 really
necessary? One can always use 24 instead) and converts it to the
specific format needed. IMHO this approach leads to a much clearer design.
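
Such a conversion routine could look roughly like the sketch below. It assumes packed, signed, little-endian PCM at 8, 16, 24 or 32 bits and normalizes everything to left-justified int32_t; the name pcm_to_s32 and the error convention are assumptions for illustration only, not an existing libavcodec interface.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical routine: convert n packed signed little-endian samples of
 * the given resolution into left-justified 32 bit samples.
 * Returns 0 on success, -1 for an unsupported resolution. */
static int pcm_to_s32(const uint8_t *src, unsigned sample_res,
                      int32_t *dst, size_t n)
{
    if (sample_res != 8 && sample_res != 16 &&
        sample_res != 24 && sample_res != 32)
        return -1;

    unsigned bytes = sample_res / 8;

    for (size_t i = 0; i < n; i++) {
        uint32_t v = 0;
        /* Assemble the little-endian bytes of one sample. */
        for (unsigned b = 0; b < bytes; b++)
            v |= (uint32_t)src[i * bytes + b] << (8 * b);
        /* Move the sign bit to bit 31; lower bits are zero-filled. */
        dst[i] = (int32_t)(v << (32 - sample_res));
    }
    return 0;
}
```

A module that only wants 16 bit could then take the top 16 bits of each result, so every consumer deals with a single intermediate format.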

Besides that, did you also consider IEEE float?
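
For reference, going from a left-justified 32 bit integer sample to IEEE float is a single scale by 2^-31. A minimal sketch, with s32_to_float as a made-up name and a nominal float range of [-1.0, 1.0] assumed:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: scale left-justified 32 bit samples to floats
 * in the nominal range [-1.0, 1.0]. */
static void s32_to_float(const int32_t *in, float *out, size_t n)
{
    const float scale = 1.0f / 2147483648.0f;   /* 2^-31 */

    for (size_t i = 0; i < n; i++)
        out[i] = (float)in[i] * scale;
}
```

A float has a 24 bit significand, so anything beyond 24 bit integer resolution is rounded in this step.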

Cheers
Axel