[FFmpeg-devel] FATE gradfun

Reimar Döffinger Reimar.Doeffinger at gmx.de
Wed Dec 5 20:59:06 CET 2012

On Wed, Dec 05, 2012 at 08:17:43PM +0100, Clément Bœsch wrote:
> Now in the SSSE3 version, the operation
>     m = (m * m * delta) >> 14
> is done with the rounding of pmulhrsw, which is equivalent to
>     m = (((((m*m)<<1) * delta) >> 14) + 1) >> 1

What is the point of doing it like this?
I haven't checked in detail, but it seems to me like it only adds
something like 1.5 bits of precision, at the cost of mismatching
the MMX code and requiring SSSE3 instead of just SSE2.
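
For illustration, the two formulas quoted above can be compared in plain
C. This is only a scalar sketch: the function names are made up here,
inputs are assumed non-negative and small enough not to overflow int,
and it does not model the 16-bit lanes of the real SIMD code.

```c
/* Truncating form: m = (m * m * delta) >> 14 */
static int scale_trunc(int m, int delta)
{
    return (m * m * delta) >> 14;
}

/* Rounding form quoted for the pmulhrsw path:
 * m = (((((m*m)<<1) * delta) >> 14) + 1) >> 1 */
static int scale_round(int m, int delta)
{
    return (((((m * m) << 1) * delta) >> 14) + 1) >> 1;
}
```

With m = 100 and delta = 3 the truncating form gives 1 and the rounding
form gives 2, so a bit-exact comparison between the two paths fails on
such inputs.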

> Concerning the MMX version, it seems there is a difference in the
> dithering: while the C and SSSE3 versions seem to use whole 8B line of
> dither, the MMX one will only use the first 4 (2 times). I'm not sure
> about how to solve that issue.

Put the second 4 into mm3, unroll the loop once?
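
A scalar model of that idea, as a rough sketch rather than the actual
asm: the helper names are hypothetical, width is assumed to be a
multiple of 8, and the real filter does more than add the dither value.

```c
#include <stdint.h>

/* Models the mismatch: a 4-wide loop that only ever reads dither[0..3]. */
static void add_dither_4wide(uint16_t *dst, const uint16_t *src,
                             const uint16_t *dither, int width)
{
    for (int x = 0; x < width; x += 4)
        for (int i = 0; i < 4; i++)
            dst[x + i] = src[x + i] + dither[i];
}

/* Unrolled once: 8 pixels per iteration, the second group of 4 uses
 * dither[4..7] (standing in for a second register such as mm3). */
static void add_dither_unrolled(uint16_t *dst, const uint16_t *src,
                                const uint16_t *dither, int width)
{
    for (int x = 0; x < width; x += 8) {
        for (int i = 0; i < 4; i++)
            dst[x + i] = src[x + i] + dither[i];
        for (int i = 0; i < 4; i++)
            dst[x + 4 + i] = src[x + 4 + i] + dither[4 + i];
    }
}
```

The unrolled variant walks the whole 8-entry dither line once per 8
pixels, which is what the C and SSSE3 versions are described as doing.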

> Any comment?

You could always use a bitexact flag to disable or modify some
of the code.
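
Something along these lines, as a sketch only. The bitexact and
have_fast parameters and all function names are hypothetical and do not
correspond to an existing option of the filter.

```c
typedef int (*scale_fn)(int m, int delta);

/* Reference behaviour: truncating shift. */
static int scale_ref(int m, int delta)
{
    return (m * m * delta) >> 14;
}

/* Stand-in for the rounding SIMD path. */
static int scale_fast(int m, int delta)
{
    return (((((m * m) << 1) * delta) >> 14) + 1) >> 1;
}

/* Pick the reference path whenever bit-exact output is requested
 * or the fast path is unavailable. */
static scale_fn select_scale(int bitexact, int have_fast)
{
    return (bitexact || !have_fast) ? scale_ref : scale_fast;
}
```

A FATE run would then request the bit-exact path so that all builds
produce the same reference output.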
