[FFmpeg-devel] h264 speed regression after PAFF

Uoti Urpala uoti.urpala
Sat Oct 13 05:26:47 CEST 2007

On Sat, 2007-10-13 at 03:08 +0200, Michael Niedermayer wrote:
> On Sat, Oct 13, 2007 at 01:57:44AM +0300, Uoti Urpala wrote:

> > Patch 1 adds av_noinline to some functions in dsputil_mmx.c.
> this seems to just affect mpeg4 ASP related functions not h.264

Seems to be true for the current version at least. Maybe it originally
did something more but I left parts out when updating it for a newer
FFmpeg version. It still seems to have a measurable positive effect, but
that could be just random and specific to the exact state of other code
at the moment (I haven't benchmarked it alone for a long time). Since
it's in the same compilation unit as h264 code it could have a more
consistent effect though.

> > Patch 2 marks various h264 functions as either av_always_inline or
> > av_noinline. Many of those changes probably have no effect; I didn't try
> > to minimize the amount of changed functions.
> this looks very interesting, though someone should split it and benchmark
> each change individually

The exact effect will likely depend on the gcc version and might not be
the sum of the individual changes, so separating out the worthwhile
changes will be hard. Earlier gcc-4.2 versions did noticeably worse
without that patch (or it might have been earlier FFmpeg code, but I
think the compiler is the more likely cause). The current version
apparently makes some of the same decisions by default.

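For readers unfamiliar with the two annotations, here is a minimal
sketch of what such inlining hints look like with gcc. The macro names
SKETCH_ALWAYS_INLINE and SKETCH_NOINLINE and both functions are
hypothetical stand-ins made up for illustration; they are not the code
from patch 2, and whether either hint helps still has to be benchmarked
per compiler version.

```c
/* Hypothetical stand-ins for inlining hints such as av_always_inline and
 * av_noinline; on gcc these map to function attributes. */
#if defined(__GNUC__)
#define SKETCH_ALWAYS_INLINE __attribute__((always_inline)) inline
#define SKETCH_NOINLINE      __attribute__((noinline))
#else
#define SKETCH_ALWAYS_INLINE inline
#define SKETCH_NOINLINE
#endif

/* Small hot helper: force it into every caller regardless of the
 * compiler's own inlining heuristics. */
static SKETCH_ALWAYS_INLINE int clip_uint8(int v)
{
    return v < 0 ? 0 : (v > 255 ? 255 : v);
}

/* Larger routine: keep it out of line so it does not bloat its callers. */
static SKETCH_NOINLINE int sum_clipped(const int *src, int n)
{
    int i, sum = 0;
    for (i = 0; i < n; i++)
        sum += clip_uint8(src[i]);
    return sum;
}
```

Calling sum_clipped() on the four values -5, 100, 300, 20 clips them to
0, 100, 255, 20 and returns 375. The hints only change code generation,
never the result.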
> > Patch 3 cleans up some of the asm in cabac.h (the HAVE_FAST_CMOV case I
> > use on my own machine). It's not primarily intended as a speedup patch
> > but did seem to make the code a bit faster. It adds proper dependencies

I left out one detail I meant to add: this code will likely have the
"missing operand" asm syntax issue which breaks the old assembler used
on OS X. The previously discussed workarounds for that could be used.
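As a rough illustration of what "proper dependencies" means for gcc
inline asm using cmov, here is a small sketch. It is not the cabac.h
code: select_if_negative() is a hypothetical helper, and the plain C
fallback is there only so it compiles on non-x86 targets. Every
register the asm reads or writes is declared as an operand, and the
flags are listed as clobbered, so the compiler does not have to guess.

```c
/* Hypothetical helper: returns a if x is negative, otherwise b. */
static inline int select_if_negative(int x, int a, int b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ (
        "testl %1, %1 \n\t"   /* set the sign flag from x            */
        "cmovs %2, %0 \n\t"   /* if the sign flag is set, b = a      */
        : "+r"(b)             /* b is both read and written          */
        : "r"(x), "r"(a)      /* x and a are only read               */
        : "cc"                /* the condition flags are overwritten */
    );
    return b;
#else
    return x < 0 ? a : b;     /* portable fallback */
#endif
}
```

With the operands declared like this the compiler is free to pick the
registers itself instead of the asm hardcoding them.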

> this one breaks gcc 2.95 so it cannot be used in its current form

I think trying to port the patch to support such an obsolete compiler
would be a waste of time. If someone has time to waste, feel free to
write a version that even 2.95 can compile, if you're able to...

> also I've seen some tabs in there which aren't allowed in ffmpeg svn

The original patch version removed most of the code not used on my
processor from the file. I cleaned it up enough not to do that but
didn't try to fix details like tabs etc. And if you still value gcc-2.95
support more than code cleanliness, efficiency or portability (this
version should work with icc) then it's probably pointless to clean it
up into committable form.
