[FFmpeg-devel] [PATCH] VP8 arithcoder asm

Reimar Döffinger Reimar.Doeffinger
Sun Jul 4 12:14:49 CEST 2010


On Sun, Jul 04, 2010 at 02:25:18AM -0700, Jason Garrett-Glaser wrote:
> This is rather odd, considering that the code looks a whole lot better
> than what gcc generates, so there must be something stalling my code
> that I'm missing, assuming my numbers are right.  It couldn't possibly
> be the extra pushes and pops implied by an extern call -- because at
> least for me, calling the vp56_rac asm function repeatedly instead of
> the merged tree function is actually faster, despite vastly more stack
> thrashing.

Maybe it causes the compiler to mess up completely in the surrounding code?
Sometimes the compiler manages to optimize pieces of code together that
are quite far apart, so any code it cannot see into (such as an external
asm function) might confuse it.
Also, which compiler version do you use? Your cmov-related "magic", for
example, does not work at all for me with gcc 4.4.4 when compiling for
Phenom II: it always generates branches...
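
To illustrate the cmov point, here is a minimal sketch. It is not the code
from the patch, and the function names select_ternary and select_mask are
made up for this example. The ternary form leaves the choice between cmov
and a branch entirely to the compiler, while the mask form has no
conditional at the source level, though the compiler is still free to
transform it.

```c
#include <stdint.h>

/* Hypothetical helper: the compiler may emit either a cmov or a
 * conditional branch here, depending on version, flags and target. */
static inline uint32_t select_ternary(int cond, uint32_t a, uint32_t b)
{
    return cond ? a : b;
}

/* Hypothetical helper: the same selection written without a conditional
 * in the source. */
static inline uint32_t select_mask(int cond, uint32_t a, uint32_t b)
{
    /* mask is all ones when cond is non-zero, all zeros otherwise */
    uint32_t mask = 0u - (uint32_t)(cond != 0);
    return (a & mask) | (b & ~mask);
}
```

Comparing the output of "gcc -O2 -S" for both functions shows what a given
compiler version actually generates for each.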


