[FFmpeg-devel] [RFC] Generic psychoacoustic model interface

Kostya kostya.shishkov
Thu Aug 28 19:17:37 CEST 2008


On Thu, Aug 28, 2008 at 10:52:58AM +0600, Alexander E. Patrakov wrote:
> Kostya wrote:
> 
> > On Wed, Aug 27, 2008 at 05:21:51PM +0600, Alexander E. Patrakov wrote:
> <snip>
> >> 3) The whole "scalefactor band lengths for long frame" business assumes
> >> non-overlapping (or almost non-overlapping) bands. This is simply not the
> >> case for DCA. For DCA, each subband (i.e., the entity for which one can
> >> specify a scale factor [ignoring transients here]) except the first and
> >> the last, has a bell-shaped form, and subbands overlap in half. I.e.
> >> something like this ASCII art attempts to depict:
> >> 
> >> .
> >>     .
> >>         .
> >>          .
> >> ,        .
> >>     ,  .
> >>     .  ,
> >> .       ,
> >> _       ,
> >>     _  ,
> >>     ,  _
> >> ,       _
> >>         _
> >>       _
> >>     _
> >> _
> >  
> > that looks a lot like AAC 8 short windows sequence
> > I think when the time comes, we'll be able to adapt it for DCA
> 
> Let's check that we indeed mean the same thing. Assume the sampling rate of
> 48 kHz. A quantization error in the 0-th DCA subband affects frequencies
> from 0 Hz to 750 Hz, with the maximum at 0 Hz. A quantization error in the
> 1-st DCA subband affects frequencies from 0 Hz to 1500 Hz, with the maximum
> at 750 Hz. A quantization error in the 2-nd DCA subband affects frequencies
> from 750 Hz to 2250 Hz, with the maximum at 1500 Hz. And so on.

Well, it is the psy model implementation that will decide how to treat
those subbands; the main principles stay the same.
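
For reference, the subband ranges quoted above can be tabulated with a few
lines of C. This is only a sketch: BandSpan and dca_band_span() are
hypothetical names, not part of the proposed header, and it assumes 32
subbands of equal width (750 Hz at 48 kHz), extrapolating the "and so on"
from the three subbands given in the quote. The first and last subbands
are simply clamped to 0 Hz and Nyquist, which ignores their different
shape.

```c
#include <stdio.h>

/* Assumption: 32 equal-width subbands, so 48000 / 2 / 32 = 750 Hz each. */
#define DCA_SUBBANDS 32

/* Hypothetical helper type for illustration only. */
typedef struct BandSpan {
    double lo_hz;   /* lowest frequency affected by a quantization error */
    double peak_hz; /* frequency where the effect is at its maximum      */
    double hi_hz;   /* highest frequency affected                        */
} BandSpan;

/* Frequency span of one subband, following the numbers in the quote:
 * the peak sits at band * width and the response reaches one band width
 * to either side, clamped to the range [0, Nyquist]. */
BandSpan dca_band_span(int band, int sample_rate)
{
    double nyquist = sample_rate / 2.0;
    double width   = nyquist / DCA_SUBBANDS;
    BandSpan s;

    s.peak_hz = band * width;
    s.lo_hz   = s.peak_hz - width;
    s.hi_hz   = s.peak_hz + width;
    if (s.lo_hz < 0.0)
        s.lo_hz = 0.0;
    if (s.hi_hz > nyquist)
        s.hi_hz = nyquist;
    return s;
}

/* Print the spans of the first `count` subbands. */
void print_band_spans(int sample_rate, int count)
{
    int i;
    for (i = 0; i < count && i < DCA_SUBBANDS; i++) {
        BandSpan s = dca_band_span(i, sample_rate);
        printf("subband %2d: %8.1f .. %8.1f Hz, peak at %8.1f Hz\n",
               i, s.lo_hz, s.hi_hz, s.peak_hz);
    }
}
```

Calling print_band_spans(48000, 3) lists the three subbands from the quote.
Neighbouring spans share half of their range, which is the overlap a psy
model implementation would have to account for.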

> And here is one more suggestion:
> 
> 4) Your header provides no way for the psychoacoustical model to say
> something like this: "In this range of frequencies, you can make THIS
> distortion to the sum of left and right channels, and nobody will notice if
> you drop the difference signal completely. Alternatively, you may change
> the left channel by X and the right channel by Y".

Like joint stereo coding in MPEG Audio Layer I-III, where high frequencies
are coded once for both channels?
And changing the actual data belongs to audio preprocessing.
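
To make the sum/difference idea concrete, here is a minimal sketch of a
mid/side transform in C. It is not the proposed interface and not how any
particular codec does it; ms_encode() and ms_decode() are hypothetical
names, and the 1/2 scaling is just one common convention. Passing NULL as
the side signal to ms_decode() models "drop the difference signal
completely", in which case both output channels become the mid signal.

```c
#include <stddef.h>

/* Forward transform: mid = (L + R) / 2, side = (L - R) / 2. */
void ms_encode(const float *l, const float *r,
               float *m, float *s, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        m[i] = 0.5f * (l[i] + r[i]);
        s[i] = 0.5f * (l[i] - r[i]);
    }
}

/* Inverse transform: L = mid + side, R = mid - side.
 * If s is NULL the side signal is treated as dropped (all zeros),
 * so both channels are reconstructed from the mid signal alone. */
void ms_decode(const float *m, const float *s,
               float *l, float *r, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        float side = s ? s[i] : 0.0f;
        l[i] = m[i] + side;
        r[i] = m[i] - side;
    }
}
```

With the side signal kept, the round trip is exact for these simple
values; with it dropped, any distortion applied to the mid signal shows up
identically in both channels, which is the case point 4 describes.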

> -- 
> Alexander E. Patrakov



