[Ffmpeg-devel] [REQUEST] MMX/MMX2 and SSE optimizations for H.264 decoding
Loren Merritt
lorenm
Thu Sep 15 18:52:42 CEST 2005
On Thu, 15 Sep 2005, Martin Boehme wrote:
> Gamester17 wrote:
>> Yes, there already are some MMX integer optimizations for H264, but what
>> about SSE (Streaming SIMD Extensions) optimizations? Isn't SSE supposed to
>> be much more powerful than MMX (and in fact the thing that replaces MMX)?
>
> Well, for a start, SSE has registers that are 128 bits wide, while MMX's
> registers are 64 bits. As long as you're operating only on the registers
> (i.e. you're CPU-bound, not memory bandwidth limited) that's an instant
> factor of 2 speedup.
On AMD, most SSE2 instructions take exactly twice as long as the
equivalent MMX instruction, so processing twice the data per instruction
gains nothing by itself. Any speedups are due only to scheduling.
In x264, we have a bunch of SSE2 functions, but most of them are _slower_
than the MMX versions on AMD.
On Intel, yes, SSE2 is faster, but still not a full factor of 2, even
before you count memory bandwidth.
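To make the arithmetic behind the AMD case concrete, here is a toy cost
model in C. It is only a sketch, not a benchmark: the function names and
the "cost per op" units are hypothetical, and it ignores loads, stores and
scheduling entirely. The only figure taken from the above is that an SSE2
op costs twice an MMX op on AMD.

```c
/* Toy model: number of register-wide SIMD ops needed to cover `bytes`
 * of data with registers of `reg_bytes` bytes (rounded up). */
static unsigned long simd_ops(unsigned long bytes, unsigned long reg_bytes)
{
    return (bytes + reg_bytes - 1) / reg_bytes;
}

/* Total cost in arbitrary units. `cost_per_op` is an assumed relative
 * cost per instruction: 1 for MMX, 2 for SSE2 on AMD as described above. */
static unsigned long simd_cost(unsigned long bytes, unsigned long reg_bytes,
                               unsigned long cost_per_op)
{
    return simd_ops(bytes, reg_bytes) * cost_per_op;
}
```

For a 16x16 block of 8-bit pixels (256 bytes), simd_cost(256, 8, 1) gives
32 units for MMX. The naive "instant factor of 2" assumption is
simd_cost(256, 16, 1), which gives 16. With each SSE2 op costing twice as
much, simd_cost(256, 16, 2) gives 32 again, i.e. no gain from the wider
registers alone.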
--Loren Merritt