[FFmpeg-devel] [RFC] AAC Encoder

Mon Aug 18 14:16:43 CEST 2008

Michael Niedermayer wrote:

> Besides, a psychoacoustic model IMHO produces perceptual weights, either per
> band or per coefficient.
> Everything else should be done per RD theory.
> Now in principle other decisions could also be made in a psychoacoustically
> aware way, but as we have seen from the quantization this is clearly not the
> case in the current model.
> What the current model does is calculate these weights (in the form of
> scale factors); the rest has absolutely nothing to do with psychoacoustics.
> It's just a trivial reference quantizer, trivial M/S selection based on better
> decorrelation, and trivial IIR filter based short window selection [and this one
> is even suboptimal in its own way as it limits itself to 9 out of 128
> groupings].
> The scalefactors from the psy model should be usable as RD factors for
> weighting between rate and distortion. I am pretty sure a relation like
> lambda = A*sf^B, with A and B constants, should be more than good enough
> for our purposes; it is for MPEG-4 ASP.

A psy model should return masking values (usually per band, but not 
necessarily the same bands as the scalefactor bands). This masking value 
is used to compute perceptual distortion, and of course a classical RD 
cost can then be computed.
Scalefactors themselves should not be computed only by the psymodel, but 
should be selected by some RD loop (the direct scalefactor computation 
from the 3gp model is a heuristic to avoid a full search, but should 
not totally replace the scalefactor search).
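
To make the idea concrete, here is a rough sketch of such an RD loop for a
single band. It is only an illustration under simplifying assumptions, not
the encoder's code: all function names are hypothetical, the quantizer is
uniform (the real AAC quantizer also applies a 3/4 power law), the rate is a
crude bit-count proxy rather than the AAC Huffman codebooks, and the
perceptual distortion is taken as quantization noise divided by the band's
masking value.

```python
import math

def quantize(coefs, sf):
    # Uniform quantizer with step 2^(sf/4); a simplification of AAC.
    step = 2.0 ** (sf / 4.0)
    return [round(c / step) for c in coefs], step

def noise_and_rate(coefs, sf):
    q, step = quantize(coefs, sf)
    # Quantization noise energy in the band.
    noise = sum((c - v * step) ** 2 for c, v in zip(coefs, q))
    # Crude rate proxy: magnitude bits plus a sign bit per nonzero value.
    rate = sum(math.log2(1 + abs(v)) + (1 if v else 0) for v in q)
    return noise, rate

def rd_cost(coefs, sf, mask, lam):
    # Perceptual distortion = noise-to-mask ratio; cost J = D + lambda * R.
    noise, rate = noise_and_rate(coefs, sf)
    return noise / mask + lam * rate

def search_scalefactor(coefs, mask, lam, candidates=range(0, 41)):
    # Exhaustive search over candidate scalefactors (assumed range).
    return min(candidates, key=lambda sf: rd_cost(coefs, sf, mask, lam))

if __name__ == "__main__":
    band = [12.3, -7.8, 3.1, 0.4]
    for mask in (0.5, 50.0):
        sf = search_scalefactor(band, mask, lam=1.0)
        print(mask, sf, noise_and_rate(band, sf))
```

A scalefactor derived directly from the psy model could be used to center a
narrower candidate range instead of searching every value, which keeps the
heuristic as a speed-up without letting it replace the search.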

-- 
Gabriel Bouvigne
www.mp3-tech.org
personal page: http://gabriel.mp3-tech.org