[FFmpeg-devel] [PATCH] unscaled float 2 int conversion

Sun May 18 19:16:07 CEST 2008

On Sun, May 18, 2008 at 04:59:53PM +0200, Andreas Öman wrote:
> Michael Niedermayer wrote:
> > On Sun, May 18, 2008 at 12:12:04AM +0200, Benjamin Larsson wrote:
> >> [...]
> >>>> so how should we go forward from this when we work on 
> >>>> implementing a new audio api. The codecs should output samples in their 
> >>>> native format, that is what I think most of us agree on. But what is the 
> >>>> native format for a codec outputting samples in float when running in 
> >>>> simd mode and the same when running in non simd mode ?
> >>> SAMPLE_FMT_FLT
> >>> and
> >>> SAMPLE_FMT_FLT_BIAS_385
> >> The reason I keep bitching about this is that SAMPLE_FMT_FLT_BIAS_385
> >> output is cumbersome to use if you want to add a filter after you have
> >> decoded a codec frame.
> > 
> > I do not understand this problem. Each filter (if we ever do have audio
> > filters) supports specific formats and conversion filters would be
> > inserted as needed.
> > Only the conversion filter needs to care about SAMPLE_FMT_FLT_BIAS_385.
> > 
> > 
> >> What I propose then is that we only use the bias trick when we are
> >> outputting 16bit samples directly after the decoder.
> > 
> > Of course, that's the whole idea behind SAMPLE_FMT_FLT_BIAS_385.
> > Also the second-to-last filter might choose to output SAMPLE_FMT_FLT_BIAS_385
> > if it knows that the next filter prefers it and converts to 16bit.
> > 
> 
> How about this: (Some of which has been discussed before)
> 
> Add
> 

> CODEC_CAP_AFRAME
> ================
> For audio codecs this means that the codec will output/receive the data
> in a (struct AVFrame). Initially no codecs will have this capability.
> In the end, all audio codecs should be converted to this.

OK, but AVFrame and sample_fmt are 2 separate things; they don't belong in
the same patch, the same thread on the ML, or anything else. I don't even see
why you mention AVFrame. It's surely a change we want, but it's separate
from sample_fmt != int16.

> 
> Decoders with CODEC_CAP_AFRAME will output samples in their
> native format (as opposed to int16_t) and thus, set avctx->sample_fmt
> to that format.
> 

> Encoders will set avctx->sample_fmt during init and will expect that
> samples are delivered in that format by the caller.

No, encoders will have a list of supported sample_fmts in AVCodec.
The user will set sample_fmt and the encoder will return -1 if it can't
handle it.
Some codecs (especially lossless ones) might support more than 1 sample_fmt.
Also, I think we want audio and video to be consistent, and not have
audio use a more limited and inconsistent way to select the format.
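
The selection scheme described here could look roughly like the sketch
below. All names in it (DemoFmt, DemoCodec, demo_encoder_init) are
hypothetical and only for illustration; this is not the libavcodec API.
It assumes nothing more than a codec exposing a terminated list of the
formats it accepts, and init failing with -1 for anything else.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real libavcodec types. */
enum DemoFmt {
    DEMO_FMT_NONE = -1,   /* list terminator */
    DEMO_FMT_S16,
    DEMO_FMT_S32,
    DEMO_FMT_FLT
};

struct DemoCodec {
    const char *name;
    const enum DemoFmt *sample_fmts;   /* terminated by DEMO_FMT_NONE */
};

/* Returns 0 if the codec accepts the format the user set, -1 otherwise. */
static int demo_encoder_init(const struct DemoCodec *codec,
                             enum DemoFmt requested)
{
    const enum DemoFmt *f;

    if (codec == NULL || codec->sample_fmts == NULL)
        return -1;
    for (f = codec->sample_fmts; *f != DEMO_FMT_NONE; f++)
        if (*f == requested)
            return 0;
    return -1;
}

/* Example: a lossless codec that accepts more than one format. */
static const enum DemoFmt demo_lossless_fmts[] = {
    DEMO_FMT_S16, DEMO_FMT_S32, DEMO_FMT_NONE
};
static const struct DemoCodec demo_lossless = {
    "demo_lossless", demo_lossless_fmts
};
```

With this layout, demo_encoder_init(&demo_lossless, DEMO_FMT_FLT) fails
with -1, while S16 and S32 are both accepted.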

> 
> For float output the native range is -1.0 to +1.0.
> 8, 16 and 32 bits integers are obvious.

> I assume that 24bit would be stored as -2^23 to +2^23 in an int32_t.

SAMPLE_FMT_S24 should be removed

> 
> CODEC_CAP_SCALE_BIAS
> ====================
> This capability signals that the decoder honors the (two new) fields
> avctx->sample_scale and avctx->sample_bias (both float I think)
> in an efficient manner. (e.g. codecs could keep a local copy
> of sample_scale, and if it is detected to have changed, the codec
> can recompute its coefficients, or whatever it uses).
> Thus, this could also be used to change the volume of the output
> during playback.

OK, though I am not sure whether bias should be a float or maybe just a
flag.
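
For readers who have not seen the bias trick, here is a rough sketch of
why a bias of 385 makes the float to int16 step cheap. It assumes
IEEE-754 single precision floats and a decoder that emits 385.0 + x for
a nominal sample x in [-1.0, 1.0). The function name is made up for
illustration, and this is not the dsputil code. Floats in [384.0, 386.0)
share one exponent and have a step of 2^-15, so the low 16 bits of the
bit pattern already hold the sample, offset by 0x8000.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: convert one biased float (385.0 + x) to int16.
 * Assumes IEEE-754 binary32, where 384.0f is 0x43C00000 and every value
 * in [384.0, 386.0) has 0x43C0 in its upper 16 bits.
 */
static int16_t demo_biased_float_to_int16(float biased)
{
    uint32_t bits;

    memcpy(&bits, &biased, sizeof bits);   /* read the raw bit pattern */

    if ((bits >> 16) != 0x43c0u) {
        /* Outside [384.0, 386.0): clip to the nearest int16 limit. */
        int too_high = !(bits & 0x80000000u) && bits > 0x43c0ffffu;
        return too_high ? INT16_MAX : INT16_MIN;
    }

    /* Low 16 bits are 0x8000 + x * 32768; remove the offset. */
    return (int16_t)((int32_t)(bits & 0xffffu) - 0x8000);
}
```

No multiply and no float to int instruction is needed per sample, which
is the point of having the decoder fold the scale and bias into its own
coefficients.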

> 
> I'm not sure if it is worth implementing scaling and bias on the
> encoder side though.

Let's worry about encoders after the decoder changes have been done ...

> 
> ...
> 
> The current versions of avcodec_decode_audio() can then be adapted to
> use these capabilities for conversion to int16_t and thus preserve
> both the external API and speed.
> Also, we'll get rid of some float2int code duplication.

No, we will NOT maintain bugs in the API to do hidden and slow conversion.
User apps which ignore sample_fmt are buggy and must be fixed. It will
not help them if we work around these bugs with conversion that loses
quality and speed.
They have to fix their buggy code eventually, and we won't do them a favor
by doing unneeded conversion.

avcodec_decode_audio() will return float/int32/int16 and set sample_fmt
appropriately.
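
On the application side, honoring sample_fmt mostly means not
hard-coding sizeof(int16_t). A minimal sketch follows, with a
hypothetical local enum (AppFmt) standing in for the real sample format
constants and made-up helper names:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the decoder's sample format field. */
enum AppFmt { APP_FMT_S16, APP_FMT_S32, APP_FMT_FLT };

/* Size in bytes of one sample of the given format, 0 if unknown. */
static size_t app_bytes_per_sample(enum AppFmt fmt)
{
    switch (fmt) {
    case APP_FMT_S16: return sizeof(int16_t);
    case APP_FMT_S32: return sizeof(int32_t);
    case APP_FMT_FLT: return sizeof(float);
    }
    return 0;
}

/*
 * Number of samples per channel in a decoded buffer of data_size bytes.
 * Returns 0 for an unknown format or a non-positive channel count.
 */
static size_t app_samples_per_channel(size_t data_size, enum AppFmt fmt,
                                      int channels)
{
    size_t bps = app_bytes_per_sample(fmt);

    if (bps == 0 || channels <= 0)
        return 0;
    return data_size / (bps * (size_t)channels);
}
```

An application that assumed int16 would compute twice the real sample
count for a float or int32 buffer of the same size.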

> 
> avcodec_decode_audio() just needs access to a dsputil context somehow.
> Perhaps a (dsputil *) could be pointed to via avctx as well.
> Ugly indeed, better suggestions are most welcome :)

As there will be no conversion in avcodec_decode_audio(), this problem does
not exist.

> 
> Once this is done we should expose the native sample formats directly
> via avcodec_{en,de}code_audio3() or similar function.
> 
> The step after that would be to start arranging for audio filters
> and such things.
> 

> I'm trying to come up with a solution that allows us to divide
> the work in smaller pieces and take one step at a time.
> Cause I think that's the only way we can make progress on this subject.

1. write a sample format converter, this can be simple and slow.
The API matters more than the implementation!

2. make ffmpeg/ffplay/ffserver use this converter based on sample_fmt
to convert 

3. start changing decoders one by one to set sample_fmt to non int16
and output their native format
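
To make step 1 concrete, a "simple and slow" converter could be as small
as the sketch below. The enum and function names (ConvFmt,
conv_to_int16) are hypothetical, it only converts towards int16, and it
assumes float samples use the nominal range -1.0 to +1.0. It is meant to
show the shape of such an API, not a proposed implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical source formats for illustration. */
enum ConvFmt { CONV_FMT_S16, CONV_FMT_S32, CONV_FMT_FLT };

/* Scale a float in [-1.0, 1.0] to int16 with clipping and rounding. */
static int16_t conv_float_to_int16(float x)
{
    float scaled = x * 32768.0f;

    if (scaled != scaled)          /* NaN: output silence */
        return 0;
    if (scaled >= 32767.0f)
        return INT16_MAX;
    if (scaled <= -32768.0f)
        return INT16_MIN;
    return (int16_t)(scaled + (scaled >= 0.0f ? 0.5f : -0.5f));
}

/*
 * Convert n samples from src (in src_fmt) to int16 in dst.
 * Returns 0 on success, -1 if the source format is not handled.
 */
static int conv_to_int16(int16_t *dst, const void *src,
                         enum ConvFmt src_fmt, size_t n)
{
    size_t i;

    switch (src_fmt) {
    case CONV_FMT_S16:
        for (i = 0; i < n; i++)
            dst[i] = ((const int16_t *)src)[i];
        return 0;
    case CONV_FMT_S32:
        /* Divide by 2^16; this truncates toward zero. */
        for (i = 0; i < n; i++)
            dst[i] = (int16_t)(((const int32_t *)src)[i] / 65536);
        return 0;
    case CONV_FMT_FLT:
        for (i = 0; i < n; i++)
            dst[i] = conv_float_to_int16(((const float *)src)[i]);
        return 0;
    default:
        return -1;
    }
}
```

A caller in the style of step 2 would pass the decoder's sample_fmt as
src_fmt and skip the call entirely when the output is already int16.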

TOTALLY independent of that is the AVFrame change:
1. add CODEC_CAP_AFRAME / avcodec_decode_audio3()

2. make ffmpeg/ffplay/ffserver use avcodec_decode_audio3() when
   CODEC_CAP_AFRAME is set.

3. convert one decoder at a time to have CODEC_CAP_AFRAME set

4. remove CODEC_CAP_AFRAME, change avcodec_decode_audio3 to
   avcodec_decode_audio and bump the major version

Anyway, the whole thing is quite trivial. I hope someone will finally start
working on something of the above instead of these endless discussions.

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Asymptotically faster algorithms should always be preferred if you have
asymptotical amounts of data