[Ffmpeg-cvslog] r5507 - trunk/libavcodec/flac.c
Justin Ruggles
jruggle
Wed Jun 21 11:32:44 CEST 2006
Loren Merritt wrote:
> On Wed, 21 Jun 2006, Justin Ruggles wrote:
>
>>Loren Merritt wrote:
>>
>>>Overflow is not an error. For 16bit audio, if qlevel<=16, then bit #32 of
>>>the sum will never be used, no matter whether it's stored in a 64bit
>>>variable or just discarded.
>>>OK, so maybe the check should be if(s->bps + qlevel > 32)
>>
>>The thing that concerns me is that while the audio itself is guaranteed
>>to be signed 16 (or 17)-bit, the level-shifted predicted value (final
>>sum) is not. A bad predictor can generate >32-bit values. An encoder
>>would be unlikely to use parameters which cause this due to them not
>>giving good compression, but it is not constrained by the spec. For
>>example, in the FLAC reference encoder there is a compile-time option to
>>check for the range of this value. If turned on, it uses a 64-bit sum
>>and generates a warning if the value is >32-bit. In my encoder, I
>>always use a 64-bit sum and return an error value (instead of issuing a
>>warning) to indicate not to encode with those parameters. Neither of
>>these solutions is in the spec though. An encoder would still be within
>>spec if it chose not to do any check, keep the predicted value & even
>>the final residual in 64-bit storage, and encode the result in the FLAC
>>file.
>
>
> That's all true. Nevertheless, it's still ok for the decoder to ignore
> overflow.
>
> FLAC does not define any clipping. If the predictor (after shifting)
> is >16bit, then the encoder must generate a residual that brings the final
> reconstructed sample back into the valid range of 16bit audio. Otherwise
> it wouldn't be lossless.
> So since you know that the high order bits of the predictor and of the
> residual must cancel, you can discard both. This is equivalent to storing
> the reconstructed samples in an array of int16_t.
>
> --Loren Merritt
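A minimal sketch of the cancellation argument above, for anyone who wants to check it by hand. The helper names (reconstruct16, encode_residual, check_cancellation) are made up for illustration and are not taken from flac.c. The encoder is assumed to compute the residual at full precision, and the decoder keeps only the low 16 bits of both terms:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not taken from flac.c: add prediction and residual
 * modulo 2^16, discarding every bit above bit 15 of both inputs. */
static int16_t reconstruct16(int64_t pred, int64_t residual)
{
    /* Conversion to an unsigned type is defined as reduction modulo 2^16. */
    uint16_t lo = (uint16_t)((uint16_t)pred + (uint16_t)residual);

    /* Map 0..65535 back to -32768..32767 without relying on
     * implementation-defined signed conversion. */
    return lo >= 0x8000 ? (int16_t)((int32_t)lo - 0x10000) : (int16_t)lo;
}

/* Assumed encoder side with ample precision: residual = sample - prediction. */
static int64_t encode_residual(int16_t sample, int64_t pred)
{
    return (int64_t)sample - pred;
}

/* A prediction far outside the 32-bit range still reconstructs exactly. */
static void check_cancellation(void)
{
    int64_t bad_pred = 5000000000LL; /* does not fit in 32 bits */
    assert(reconstruct16(bad_pred, encode_residual(-12345, bad_pred)) == -12345);
}
```

The high-order bits of bad_pred and of the residual cancel, so dropping them before the addition changes nothing in the low 16 bits. Note that this is wraparound, not saturation: a decoder that clamped the prediction instead would get the wrong sample.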
Thanks Loren. After a couple of hours of wrapping my head around it, I
finally convinced myself that you're right. :) The decoder always
knows the original sample depth, and as long as you truncate the final
result to that depth, you get the right value. But I think the check
should be if(s->curr_bps + qlevel > 32), because at this point the
channels have not been re-correlated, so the sample value can be 17-bit
here even when the original audio is only 16-bit. qlevel is actually
<=15 though, so a 32-bit sum will still always work with 16-bit original
audio.
-Justin
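
For illustration, the proposed check could be sketched roughly as follows. This is not the code in flac.c: needs_64bit_sum and lpc_predict are hypothetical names, the coefficient and history layout is an assumption, and the 32-bit path assumes two's complement conversion and an arithmetic right shift of negative values, as found on common compilers.

```c
#include <stdint.h>

/* Hypothetical predicate: a 32-bit sum keeps the low (32 - qlevel) bits of
 * the shifted prediction intact, so it suffices when curr_bps fits in them. */
static int needs_64bit_sum(int curr_bps, int qlevel)
{
    return curr_bps + qlevel > 32;
}

/* Hypothetical predictor: sum of coeffs[j] * hist[j], shifted by qlevel. */
static int32_t lpc_predict(const int32_t *hist, const int *coeffs,
                           int order, int qlevel, int wide)
{
    int j;

    if (wide) {
        int64_t sum = 0;
        for (j = 0; j < order; j++)
            sum += (int64_t)coeffs[j] * hist[j];
        return (int32_t)(sum >> qlevel);
    } else {
        /* Unsigned arithmetic wraps modulo 2^32 without undefined behavior. */
        uint32_t sum = 0;
        for (j = 0; j < order; j++)
            sum += (uint32_t)coeffs[j] * (uint32_t)hist[j];
        return (int32_t)sum >> qlevel;
    }
}
```

With 16-bit audio, curr_bps is at most 17 and qlevel at most 15, so needs_64bit_sum(17, 15) is false and the 32-bit path is always enough. With 24-bit audio the same qlevel gives 39 and the wide path is needed.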