[FFmpeg-devel] [augustus at linuxhardware.org: SSE4 and FFMPEG]

Guillaume Poirier gpoirier
Tue Oct 30 13:54:56 CET 2007


This is gonna be a little off-topic, but well...

> ----- Forwarded message from "Kris Kersey (Augustus)" <augustus at linuxhardware.org> -----
> From: "Kris Kersey (Augustus)" <augustus at linuxhardware.org>
> Date: Mon, 29 Oct 2007 14:32:20 +0000 (UTC)
> To: diego at biurrun.de
> Subject: SSE4 and FFMPEG
> Diego,
> I am writing an article about the new "Penryn" Intel processor for 
> LinuxHardware.org.  Can you or another member of the FFMEG team comment on 
> what impact SSE4 will have on FFMPEG and whether there are any plans to 
> write SSE4 optimized code?  Thank you for your time.

IMVHO, SSE4 alone won't "revolutionize" SIMD optimization on x86. It's
more the combination of SSE4 + single-pass shuffle unit (aka Super
Shuffle) that comes with Penryn cores.
The thing is: pre-Penryn cores suffered from very expensive shuffle
operations that, in some cases, would make the vectorized version of
a piece of code bring very little speed-up compared to the scalar code.
Altivec programmers have seen this during the MacPPC->Macintel
transition: with Altivec, shuffles/permutations were almost free,
whereas on x86, not at all.
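
For anyone unsure what a "shuffle" is here, below is a rough scalar
sketch in C of what one 16-byte shuffle instruction computes. It is
modelled on the documented behaviour of SSSE3's PSHUFB; the function
name byte_shuffle16 is made up for illustration and is not FFmpeg code.
The point is only that the hardware does all 16 selections in one
instruction, so how fast that instruction runs matters a lot.

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only (not from FFmpeg).
 * Scalar model of a 16-byte shuffle in the style of PSHUFB:
 * each output byte is selected from src by the low 4 bits of the
 * matching control byte, or set to 0 if that control byte has its
 * top bit set. */
static void byte_shuffle16(uint8_t dst[16], const uint8_t src[16],
                           const uint8_t ctrl[16])
{
    uint8_t tmp[16];

    /* Build the result in a temporary so dst may alias src. */
    for (int i = 0; i < 16; i++)
        tmp[i] = (ctrl[i] & 0x80) ? 0 : src[ctrl[i] & 0x0F];

    for (int i = 0; i < 16; i++)
        dst[i] = tmp[i];
}
```

With a control vector of 15, 14, ..., 0 this reverses the 16 bytes;
other control vectors give interleaves, broadcasts and so on, which is
the kind of data rearrangement vectorized codec code tends to need
between the arithmetic steps.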

I expect Penryn to offer a much more consistent speed-up when
someone takes the time to vectorize some code.

