[Ffmpeg-devel-old] Re: [Ffmpeg-devel] Snow motion blocks

Tue Apr 19 14:43:11 CEST 2005

On Tue, 19 Apr 2005, Robert Edele wrote:

> This filter IS motion comp. What you want is for the pixels to match as
> closely as possible to what the actual pixel is, to save bits.

Exactly. So given an array of previously decoded pixels S={a0,a1,a2,...,an}
one wants to predict another pixel p. Usually this is simplified into a
linear problem: we want to find a set of weights {w0,w1,w2,...,wn}
so that the error ( w0*a0 + w1*a1 + ... + wn*an - p )^2 is minimized.

One also wants to predict many pixels {p0,p1,p2,...}, each from its own
array of decoded pixels {S0,S1,S2,...}, using the same set of
weights (otherwise the problem would be trivial).
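
To make the setup concrete, here is a minimal sketch of the batch version
of this problem: it builds the normal equations from all (S_k, p_k) pairs
and solves them exactly. The function names and the tiny data set are my
own illustration, not anything taken from snow, and a singular system is
simply not handled.

```python
from fractions import Fraction

def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with exact fractions.

    Assumes A is square and non-singular (illustration only).
    """
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row with a non-zero pivot and swap it into place.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so the pivot becomes 1.
        pv = M[col][col]
        M[col] = [v / pv for v in M[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def least_squares_weights(contexts, targets):
    """Weights w minimizing sum_k (w . S_k - p_k)^2 via the normal equations."""
    n = len(contexts[0])
    # R = sum_k S_k S_k^T and r = sum_k p_k S_k
    R = [[sum(s[i] * s[j] for s in contexts) for j in range(n)] for i in range(n)]
    r = [sum(s[i] * p for s, p in zip(contexts, targets)) for i in range(n)]
    return solve_linear(R, r)
```

If every target is exactly the average of its two neighbours, the solver
recovers the linear interpolation weights 1/2 and 1/2.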

Fortunately, there are efficient algorithms which solve just this
kind of problem, for example RLS (recursive least squares).
They are actually fast enough to run in real time, which makes it
possible to compute the weights adaptively.
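
For reference, a minimal sketch of the textbook RLS recursion, processing
one (S, p) pair at a time. The names rls_update and rls_fit, the initial
value delta and the forgetting factor lam are my choices for illustration;
nothing here is taken from an existing codec.

```python
def rls_update(w, P, x, d, lam=1.0):
    """One recursive-least-squares step.

    w: current weights, P: current inverse correlation matrix estimate,
    x: input vector (the decoded pixels), d: desired value (the pixel to
    predict), lam: forgetting factor (1.0 means no forgetting).
    """
    n = len(x)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(x[i] * Px[i] for i in range(n))
    k = [v / denom for v in Px]                      # gain vector
    err = d - sum(w[i] * x[i] for i in range(n))     # a priori prediction error
    w = [w[i] + k[i] * err for i in range(n)]
    # P is symmetric, so x^T P equals (P x)^T.
    P = [[(P[i][j] - k[i] * Px[j]) / lam for j in range(n)] for i in range(n)]
    return w, P

def rls_fit(samples, n, delta=1e6, lam=1.0):
    """Run RLS over (x, d) samples, starting from w = 0 and P = delta * I."""
    w = [0.0] * n
    P = [[delta if i == j else 0.0 for j in range(n)] for i in range(n)]
    for x, d in samples:
        w, P = rls_update(w, P, x, d, lam)
    return w
```

Each update costs O(n^2) operations for n weights, which is the reason it
can plausibly be run per pixel. A lam slightly below 1.0 lets the weights
track slowly changing statistics.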

The weights should
- Compensate for non-integer translation
- Remove noise (i.e. data that is uncorrelated between the pixel we want to
   predict and the array from which we predict)
- Correct for other distortions (e.g. in one paper aliasing caused by the
   camera was found to be significant).

Some work has been done on computing the interpolation weights
adaptively, but not much, and I'm planning to work on that myself
later. One of the important questions is whether the gain from
adaptive weights is worth the computation.

>> fixed and let one vary depending on the other. For instance compare
>> psnr at a fixed bitrate, or file size at a fixed psnr...

I wouldn't recommend using rate control for this, though, as it may have
some bad effects, especially if the allowed buffer size is large and the
sequence is short. So it is better to use fixed quantization (or whatever
the equivalent in snow is).

> psnr went up as file size went down, and vice versa. I wouldn't have a
> clue how to separate the two, but as they both go the same way, it
> doesn't really matter.

I'd like to advocate encoding with several fixed quantization values
and plotting a graph of rate versus psnr. This also shows whether a method
works better at high or low bitrates. Then, looking at the graph,
you can easily see how large an improvement in psnr is required before a
given increase in bit rate is justified.
(several examples on my homepage)
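
As a rough sketch of how such curves can be compared numerically: measure
a few (rate, psnr) points per method, then interpolate the rate each
method needs to reach the same psnr. The helper names are hypothetical,
and straight-line interpolation between measured points is an assumption
made for simplicity.

```python
import math

def psnr(mse, peak=255.0):
    """PSNR in dB for a given mean squared error and peak sample value."""
    return 10.0 * math.log10(peak * peak / mse)

def rate_at_psnr(curve, target_psnr):
    """Linearly interpolate the rate at which a curve reaches target_psnr.

    curve is a list of (rate, psnr) points, one per quantizer setting.
    """
    pts = sorted(curve, key=lambda rp: rp[1])
    for (r0, q0), (r1, q1) in zip(pts, pts[1:]):
        if q0 <= target_psnr <= q1:
            if q1 == q0:
                return r0
            t = (target_psnr - q0) / (q1 - q0)
            return r0 + t * (r1 - r0)
    raise ValueError("target PSNR lies outside the measured range")

def rate_change_percent(curve_a, curve_b, target_psnr):
    """Percent change in rate of method B relative to method A at equal PSNR."""
    ra = rate_at_psnr(curve_a, target_psnr)
    rb = rate_at_psnr(curve_b, target_psnr)
    return 100.0 * (rb - ra) / ra
```

A positive result from rate_change_percent means method B needs more bits
than method A for the same quality at that operating point.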

> Not exactly, but brute force doesn't require that.

That's why I like brute force :)

> Yes. You use a cubic (ax^3 + bx^2 + cx + d) interpolation to figure out
> a pixel value, using 4 coefficients as inputs, instead of two inputs and
> linear (ax + b) interpolation. It generally gets you closer to the
> 'real' value, which hopefully would mean a better motion comp match.

Just for reference, the weights are:
linear interpolation:       1/2     1/2
cubic interpolation:       -1/16    9/16     9/16    -1/16
5th degree interpolation:   3/256  -25/256  150/256  150/256  -25/256  3/256
H.264 style interpolation:  1/32   -5/32    20/32    20/32    -5/32    1/32
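
The linear, cubic and 5th degree rows are the Lagrange interpolation
weights evaluated halfway between the two middle samples. A small sketch
that reproduces them with exact fractions (midpoint_weights is a name
made up here; the H.264 row is a separate filter design and does not come
out of this formula):

```python
from fractions import Fraction

def midpoint_weights(taps):
    """Lagrange weights for the half-pel position from `taps` integer samples.

    The samples sit at integer positions centred around the gap between
    0 and 1, e.g. -1, 0, 1, 2 for taps=4, and the value is taken at 1/2.
    """
    half = taps // 2
    nodes = [Fraction(i) for i in range(-half + 1, half + 1)]
    x = Fraction(1, 2)
    weights = []
    for i, xi in enumerate(nodes):
        w = Fraction(1)
        for j, xj in enumerate(nodes):
            if j != i:
                w *= (x - xj) / (xi - xj)
        weights.append(w)
    return weights
```

Calling it with 2, 4 and 6 taps gives the three rows above (as reduced
fractions, so 150/256 shows up as 75/128).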

One idea I had was to approximate the half-pixel SAD value in H.264
not by interpolating the image to half-pixel resolution and computing
the SAD on that, but by interpolating the integer-position SAD values
directly, using interpolation polynomials of various degrees...
Unfortunately the bit rate increased by around 5% (at fixed PSNR).
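
A one-dimensional toy sketch of that idea, to show what "interpolating the
SAD" means. This is only my simplified reading (1-D signal, cubic weights,
made-up function names), not the code used in the experiment.

```python
CUBIC = (-1, 9, 9, -1)  # cubic half-pel weights, to be divided by 16

def sad(block, ref, offset):
    """Sum of absolute differences between block and ref starting at offset."""
    return sum(abs(b - ref[offset + i]) for i, b in enumerate(block))

def half_pel_sad_estimate(block, ref, offset):
    """Estimate the SAD at position offset + 1/2 without interpolating pixels.

    Uses the SADs at the four integer positions offset-1 .. offset+2 and
    applies the cubic weights to those SAD values. The caller must make
    sure all four positions lie inside ref.
    """
    sads = [sad(block, ref, offset + d) for d in (-1, 0, 1, 2)]
    return sum(w * s for w, s in zip(CUBIC, sads)) / 16
```

Note that the result is only an estimate: it can differ from the SAD
computed on interpolated pixels, and because of the negative weights it
can even come out negative.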

Anyway, the 5th degree interpolation polynomial and the H.264 style
interpolation were virtually equally good, but the H.264 coefficients
have a smaller denominator (32 instead of 256), and I suppose that is
why the standardization committee chose them.