[Ffmpeg-devel] improving encoding (possibly big perceptual gains)

Mon Jan 2 19:56:59 CET 2006

On Mon, Jan 02, 2006 at 07:37:40PM +0100, Michael Niedermayer wrote:
> Hi
> 
> On Wed, Dec 28, 2005 at 04:09:32PM -0500, Rich Felker wrote:
> > with all the recent discussions over improving lavc, i've got some
> > ideas. mainly i'm worried that psnr is very misleading when trying to
> > improve motion estimation decisions since often a particular motion
> > vector will give better psnr but much worse actual quality due to loss
> > of small detail and mud/blocking.
> > 
> > because of this, i'd like to suggest that lavc be extended to measure
> > the 'uqi' (universal [image] quality index) proposed by the paper at
> > this site:
> > 
> > http://www.cns.nyu.edu/~zwang/files/research/quality_index/demo_lena.html
> > 
> > i think the images there speak for themselves regarding how stupid
> > psnr is as a judge of quality.
> > 
> > however, printing statistics isn't much use if you can't actually
> > improve encoding. so the main idea is to add a *cmp function to
> > measure the uqi and use this for motion estimation and optimal
> > quantization and qprd, etc. maybe also the ratecontrol engine itself.
> > this should do a much better job of telling when it's ok to use poor
> > quantization, zero ac coeffs, etc. and when it will look like shit
> > (like the last lena picture ;).
> 
> there's a problem with using this quality measure for any decision,
> look at it, it pretty much ignores brightness changes, not only will

It doesn't. There are 3 factors: correlation, brightness, and
contrast. If you think the brightness and/or contrast are weighted too
low for use in encoding decisions you can apply an appropriate
exponent to them.
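
To make the three factors concrete, here is a minimal sketch of the index split into its correlation, brightness (luminance) and contrast terms, with a per-factor exponent. The function names and the `w_*` exponent parameters are made up for illustration and are not lavc code; it assumes non-flat blocks (nonzero variance) and non-negative correlation whenever a fractional exponent is used.

```python
import math

def uqi_factors(x, y):
    """Return (correlation, luminance, contrast) for two equal-length pixel lists.

    Assumes neither block is flat (both variances nonzero) and that the
    means are not both zero.
    """
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / (n - 1)
    vy = sum((b - my) ** 2 for b in y) / (n - 1)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    sx, sy = math.sqrt(vx), math.sqrt(vy)
    correlation = cov / (sx * sy)                   # structure / detail match
    luminance = 2 * mx * my / (mx * mx + my * my)   # mean brightness match
    contrast = 2 * sx * sy / (vx + vy)              # contrast match
    return correlation, luminance, contrast

def weighted_uqi(x, y, w_corr=1.0, w_lum=1.0, w_con=1.0):
    """Product of the three factors, each raised to a hypothetical exponent.

    With all exponents at 1.0 this is the plain quality index.
    """
    c, l, k = uqi_factors(x, y)
    return (c ** w_corr) * (l ** w_lum) * (k ** w_con)

# Same detail, brightness shifted by 20: only the luminance term drops.
x = [10, 20, 30, 40]
y = [a + 20 for a in x]
c, l, k = uqi_factors(x, y)
plain = weighted_uqi(x, y)
strict = weighted_uqi(x, y, w_lum=4.0)  # penalize brightness error harder
```

Raising `w_lum` above 1 makes the same brightness error cost more, which is one way to address the concern that brightness changes are weighted too low.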

> that ignore flickering/blinking but if done per block brightness will
> be off per block this will not look good at all -> PSNR is still the
> better cmp function for these decisions, maybe it should be done in
> dct domain and different frequencies should be weighted differently
> maybe a closer to 0 error should be considered less wrong than a farther
> away from zero error ...

IMO this does not address the problem I'm talking about. A motion
block with perfect correlation but brightness way off is still a very
good match, since a single DC coefficient can encode the residue
perfectly and no detail will be lost. PSNR-based motion estimation
will often choose a solid or near-solid block when there's very little
detail, resulting in the same kinda ugliness that we've been seeing.
(The coefficients of the residue are very small and high-frequency, so
they get quantized to 0.)
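
A toy illustration of that argument, as a sketch only: the 4x4 block size, the flat quantizer step of 16, the orthonormal DCT and all names below are assumptions picked to keep the numbers small, not what lavc does. The source block has faint checkerboard detail; one candidate has the same detail with brightness off by 8, the other is a solid block.

```python
import math

N = 4  # block size (assumption for this sketch)

def dct2(block):
    """Orthonormal 2-D DCT-II of an N x N block given as a list of lists."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for i in range(N):
                for j in range(N):
                    s += (block[i][j]
                          * math.cos((2 * i + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * j + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * s
    return out

def ssd(a, b):
    """Sum of squared differences, the error that PSNR is derived from."""
    return sum((a[i][j] - b[i][j]) ** 2 for i in range(N) for j in range(N))

def quantized_residue(src, pred, qstep):
    """DCT of (src - pred), quantized with one flat step size."""
    res = [[src[i][j] - pred[i][j] for j in range(N)] for i in range(N)]
    return [[round(c / qstep) for c in row] for row in dct2(res)]

# Source block: faint checkerboard detail around a brightness of 100.
src = [[100 + (2 if (i + j) % 2 == 0 else -2) for j in range(N)]
       for i in range(N)]
shifted = [[p + 8 for p in row] for row in src]  # same detail, brightness off
flat = [[100] * N for _ in range(N)]             # solid block, no detail

ssd_shifted = ssd(src, shifted)
ssd_flat = ssd(src, flat)
q_shifted = quantized_residue(src, shifted, 16)
q_flat = quantized_residue(src, flat, 16)
```

Here the solid block wins on squared error by a wide margin, yet its residue quantizes to all zeros, so the decoded block is solid and the detail is gone. The brightness-shifted candidate loses on squared error, but its residue is a single DC coefficient that survives quantization and restores the block exactly.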

Rich