[FFmpeg-devel] [RFC] Generic psychoacoustic model interface

Alexander E. Patrakov patrakov
Thu Aug 28 06:52:58 CEST 2008


Kostya wrote:

> On Wed, Aug 27, 2008 at 05:21:51PM +0600, Alexander E. Patrakov wrote:
<snip>
>> 3) The whole "scalefactor band lengths for long frame" business assumes
>> non-overlapping (or almost non-overlapping) bands. This is simply not the
>> case for DCA. For DCA, each subband (i.e., the entity for which one can
>> specify a scale factor [ignoring transients here]) except the first and
>> the last, has a bell-shaped form, and subbands overlap in half. I.e.
>> something like this ASCII art attempts to depict:
>> 
>> .
>>     .
>>         .
>>          .
>> ,        .
>>     ,  .
>>     .  ,
>> .       ,
>> _       ,
>>     _  ,
>>     ,  _
>> ,       _
>>         _
>>       _
>>     _
>> _
>  
> that looks a lot like AAC 8 short windows sequence
> I think when the time comes, we'll be able to adapt it for DCA

Let's check that we indeed mean the same thing. Assume a sampling rate of
48 kHz. A quantization error in DCA subband 0 affects frequencies from 0 Hz
to 750 Hz, with the maximum at 0 Hz. A quantization error in subband 1
affects frequencies from 0 Hz to 1500 Hz, with the maximum at 750 Hz. A
quantization error in subband 2 affects frequencies from 750 Hz to 2250 Hz,
with the maximum at 1500 Hz. And so on: each subband's error peaks at a
multiple of 750 Hz and reaches as far as the peaks of both neighbours.
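
To make the arithmetic above concrete, here is a small sketch that
reproduces those three ranges. It assumes 32 subbands (inferred from the
750 Hz spacing at 48 kHz); the function name and the clamping at the band
edges are my own illustration, not anything taken from the header under
discussion.

```python
# Sketch only: frequency range touched by a quantization error in DCA
# subband k, following the description in the text.
# Assumption: 32 subbands, so the spacing at 48 kHz is 24000 / 32 = 750 Hz.

SAMPLE_RATE_HZ = 48_000
NUM_SUBBANDS = 32  # assumed, see note above


def subband_error_range(k, sample_rate=SAMPLE_RATE_HZ, num_subbands=NUM_SUBBANDS):
    """Return (low_hz, peak_hz, high_hz) for subband k."""
    nyquist = sample_rate / 2
    spacing = nyquist / num_subbands
    peak = k * spacing
    # Each subband reaches one spacing to either side of its peak.
    # Clamping to [0, nyquist] is a simplification for the edge subbands.
    low = max(0.0, peak - spacing)
    high = min(nyquist, peak + spacing)
    return low, peak, high


for k in range(3):
    low, peak, high = subband_error_range(k)
    print(f"subband {k}: {low:.0f}-{high:.0f} Hz, peak at {peak:.0f} Hz")
```

Note that adjacent ranges share half their width, which is exactly the
overlap that a table of non-overlapping band lengths cannot express.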

And here is one more suggestion:

4) Your header provides no way for the psychoacoustic model to say
something like this: "In this range of frequencies, you can introduce THIS
much distortion into the sum of the left and right channels, and nobody
will notice if you drop the difference signal completely. Alternatively,
you may distort the left channel by X and the right channel by Y".
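
As a rough illustration of what such a report could carry, here is a
sketch of a per-band record with both alternatives. Every name in it is
hypothetical and the distortion values are in arbitrary units; it is not
a proposal for the actual header layout, only a way to show that the
model has to be able to return two options for the same band.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StereoBandTolerance:
    """Hypothetical per-band output of a psychoacoustic model."""
    low_hz: float
    high_hz: float
    # Option A: tolerated distortion on the sum (L+R), with the
    # difference (L-R) signal dropped entirely.
    sum_distortion: Optional[float] = None
    can_drop_difference: bool = False
    # Option B: tolerated distortion on each channel separately.
    left_distortion: Optional[float] = None
    right_distortion: Optional[float] = None


def available_options(t: StereoBandTolerance) -> List[str]:
    """List which of the two alternatives the model offered for this band."""
    options = []
    if t.sum_distortion is not None and t.can_drop_difference:
        options.append("sum_only")
    if t.left_distortion is not None and t.right_distortion is not None:
        options.append("left_right")
    return options


# Example band where the model offers both alternatives (made-up numbers).
band = StereoBandTolerance(
    low_hz=4000.0, high_hz=8000.0,
    sum_distortion=0.01, can_drop_difference=True,
    left_distortion=0.02, right_distortion=0.03,
)
print(available_options(band))
```

The encoder would then pick whichever alternative is cheaper to code for
that band.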

-- 
Alexander E. Patrakov




