[FFmpeg-devel] AAC psychoacoustic model suggestions?
Gabriel Bouvigne
bouvigne
Thu Jun 19 19:51:46 CEST 2008
Kostya wrote:
> I know such words as ATH, Bark, GPsycho, AoTuV and
> ISO 13818-7 Annex C.
>
> Can you give more tips/suggestions/whatever on
> psychoacoustic model implementations worth trying.
> I can have several models implemented, so good ideas won't
> be thrown away :)
I'd strongly suggest checking the 3GPP AAC reference code and its
associated docs:
http://www.3gpp.org/ftp/Specs/html-info/26-series.htm
It is quite a clean encoder (compared to messy ones like LAME), and the
docs provide a good introduction.
You might notice that this encoder doesn't bother with tonality
estimation. While this is relatively unusual, it allows a simpler
model and reduces the risk of errors due to heuristic failures. I'd
suggest that you also skip tonality estimation at first.
Regarding ISO 13818-7, you should know that while it suggests a psy
model, it is far from describing a GOOD psy model. You can read it and
understand it, but a direct algorithm-to-code transcription would
probably be a waste of time.
Regarding LAME, we switched from the initial GPsycho model to NSPsytune
a few years ago. The main differences between the two are:
* Tonality estimation: GPsycho uses a predictability measure, while
NSPsytune uses spectral flatness
* NSPsytune uses additive masking. I'd suggest not bothering with
additive masking, which is full of potential traps
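To make the spectral flatness idea concrete, here is a minimal sketch
in Python. It is not LAME's code, only the textbook definition
(geometric mean of the power spectrum divided by its arithmetic mean).
The function name and the small `eps` floor are my own assumptions for
illustration.

```python
import math

def spectral_flatness(power):
    """Geometric mean / arithmetic mean of a power spectrum, in [0, 1].

    Values near 1 suggest a noise-like (flat) spectrum, values near 0
    suggest a tone-like (peaky) one.
    """
    eps = 1e-12  # hypothetical floor so that log(0) never happens
    n = len(power)
    if n == 0:
        raise ValueError("power spectrum must not be empty")
    # Geometric mean computed in the log domain to avoid overflow.
    log_sum = sum(math.log(p + eps) for p in power)
    geometric_mean = math.exp(log_sum / n)
    # Same eps offset on both means keeps the ratio at or below 1.
    arithmetic_mean = sum(power) / n + eps
    return geometric_mean / arithmetic_mean

# A flat spectrum versus one strong line over a very low floor.
flat = spectral_flatness([1.0] * 64)
peaky = spectral_flatness([1000.0] + [1e-6] * 63)
```

Here `flat` comes out close to 1 and `peaky` close to 0. A real model
would compute this per band and map it to a tonality index, which is
where the heuristics (and their failures) come in.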
It seems to me that what you would have to implement is (unordered list):
* Spreading (even a simple one)
* Computation of quantization error compared to masking, in order to use
it within the potential quantization loop (a bit of perceptual RD?)
* LR or MS decision
* Block switching decision (beware: you should do it in advance for the
next frame in order to know the block type of the current frame)
* A lowpass would not hurt
* probably some kind of dropout/spectral hole prevention
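As a rough illustration of the first two items, a simple spreading and
a noise-to-mask check could look like the Python sketch below. This is
not taken from the 3GPP code or from LAME. The slopes (27 and 15 dB per
band), the 18 dB offset and all function names are placeholder
assumptions, and each band is assumed to be about one Bark wide. The
spreading keeps the strongest masker per band instead of summing, so it
stays clear of additive masking.

```python
import math

def spread_energies(band_energy, lower_slope_db=27.0, upper_slope_db=15.0):
    """Spread band energies onto neighbouring bands with two fixed slopes.

    Assumes bands roughly one Bark wide; slopes are in dB per band and
    are illustrative placeholders, not tuned values.
    """
    n = len(band_energy)
    spread = [0.0] * n
    for masker, energy in enumerate(band_energy):
        for target in range(n):
            distance = target - masker
            # Masking decays more slowly towards higher frequencies.
            slope = upper_slope_db if distance > 0 else lower_slope_db
            attenuation_db = slope * abs(distance)
            contribution = energy * 10.0 ** (-attenuation_db / 10.0)
            # Keep the strongest masker only (no additive masking).
            spread[target] = max(spread[target], contribution)
    return spread

def masking_threshold(band_energy, offset_db=18.0):
    """Spread energy lowered by a fixed offset (hypothetical 18 dB)."""
    scale = 10.0 ** (-offset_db / 10.0)
    return [e * scale for e in spread_energies(band_energy)]

def noise_to_mask_db(noise_energy, threshold):
    """Per-band quantization noise relative to the threshold, in dB.

    Positive values mean the noise is above the masking threshold.
    """
    floor = 1e-12  # avoids log10(0) and division by zero
    return [10.0 * math.log10((n + floor) / (t + floor))
            for n, t in zip(noise_energy, threshold)]

# One strong band in the middle of five.
energies = [0.0, 0.0, 100.0, 0.0, 0.0]
spread = spread_energies(energies)
threshold = masking_threshold(energies)
# Pretend the quantizer produced noise 10x above the threshold everywhere.
nmr = noise_to_mask_db([t * 10.0 for t in threshold], threshold)
```

In a quantization loop you would compare the per-band values in `nmr`
against 0 dB (or sum the positive ones) to decide where to spend more
bits.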
Of course, there are many more things that could also be implemented
later (multichannel, PNS, ...)
--
Gabriel Bouvigne
www.mp3-tech.org
personal page: http://gabriel.mp3-tech.org