[Ffmpeg-cvslog] r5507 - trunk/libavcodec/flac.c

Wed Jun 21 11:32:44 CEST 2006

Loren Merritt wrote:
> On Wed, 21 Jun 2006, Justin Ruggles wrote:
> 
>>Loren Merritt wrote:
>>
>>>Overflow is not an error. For 16bit audio, if qlevel<=16, then bit #32 of
>>>the sum will never be used, no matter whether it's stored in a 64bit
>>>variable or just discarded.
>>>OK, so maybe the check should be if(s->bps + qlevel > 32)
>>
>>The thing that concerns me is that while the audio itself is guaranteed
>>to be signed 16 (or 17)-bit, the level-shifted predicted value (final
>>sum) is not.  A bad predictor can generate >32-bit values.  An encoder
>>would be unlikely to use parameters which cause this due to them not
>>giving good compression, but it is not constrained by the spec.  For
>>example, in the FLAC reference encoder there is a compile-time option to
>>check for the range of this value.  If turned on, it uses a 64-bit sum
>>and generates a warning if the value is >32-bit.  In my encoder, I
>>always use a 64-bit sum and return an error value (instead of issuing a
>>warning) to indicate not to encode with those parameters.  Neither of
>>these solutions is in the spec though.  An encoder would still be within
>>spec if it chose not to do any check, keep the predicted value & even
>>the final residual in 64-bit storage, and encode the result in the FLAC
>>file.
> 
> 
> That's all true. Nevertheless, it's still ok for the decoder to ignore 
> overflow.
> 
> FLAC does not define any clipping. If the predictor (after shifting) 
> is >16bit, then the encoder must generate a residual that brings the final 
> reconstructed sample back into the valid range of 16bit audio. Otherwise 
> it wouldn't be lossless.
> So since you know that the high order bits of the predictor and of the 
> residual must cancel, you can discard both. This is equivalent to storing 
> the reconstructed samples in an array of int16_t.
> 
> --Loren Merritt
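
To make the cancellation argument above concrete, here is a minimal sketch in C. It is not the code in flac.c: the function names and the fixed 16-bit depth are assumptions chosen for illustration. It reconstructs one sample twice, once with a full 64-bit sum and once keeping only the low 32 bits of the predictor and the residual, and both give the same 16-bit result whenever the true sample fits in 16 bits.

```c
#include <stdint.h>

/* Hypothetical helper: sign-extend the low 16 bits of x. */
static int32_t sign_extend16(uint32_t x)
{
    return (int32_t)((x & 0xFFFFu) ^ 0x8000u) - 0x8000;
}

/* Reference: keep every bit of the predictor and the residual.
 * Assumes the true sum fits in 16 bits, as lossless coding requires. */
static int32_t reconstruct_wide(int64_t pred, int64_t residual)
{
    return (int32_t)(pred + residual);
}

/* Discard the high-order bits of both operands before adding.
 * Unsigned arithmetic wraps modulo 2^32, so there is no undefined
 * behaviour, and the low 16 bits of the sum are unaffected. */
static int32_t reconstruct_wrapped(int64_t pred, int64_t residual)
{
    uint32_t p = (uint32_t)pred;      /* low 32 bits of the predictor */
    uint32_t r = (uint32_t)residual;  /* low 32 bits of the residual  */
    return sign_extend16(p + r);
}
```

For example, with a (deliberately bad) predictor of 5000000000 and a true sample of -1234, the residual is -1234 - 5000000000; both functions return -1234 even though neither operand fits in 32 bits.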

Thanks Loren.  After a couple hours of wrapping my head around it, I
finally convinced myself that you're right. :)  The decoder will always
know the original sample depth, and as long as you truncate the final
result to that depth, you'll get the right value.  But I think the check equation
should be if(s->curr_bps + qlevel > 32) because at this point the
channels have not been re-correlated, so the sample value can be 17-bit
here even when the original audio is only 16-bit.  qlevel is actually
<=15 though, so a 32-bit sum will still always work with 16-bit original
audio.
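
As a rough sketch of that check (the helper name needs_wide_sum is hypothetical, and only curr_bps and qlevel come from the discussion above), the test could be written as:

```c
/* Returns nonzero when a 32-bit sum may be too narrow, i.e. when the
 * bit depth of the current (possibly decorrelated) channel plus the
 * quantization level exceeds 32 bits. Hypothetical helper, not the
 * actual flac.c code. */
static int needs_wide_sum(int curr_bps, int qlevel)
{
    return curr_bps + qlevel > 32;
}
```

With 16-bit original audio the worst case is curr_bps = 17 and qlevel = 15, which gives exactly 32, so the check stays false.  With 24-bit audio, a 25-bit side channel and qlevel = 15 give 40, so the check fires.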

-Justin