[FFmpeg-devel] [PATCH] VC-1 MMX DSP functions
Sun Jul 8 21:57:47 CEST 2007
On 7/8/07, Zuxy Meng <zuxy.meng at gmail.com> wrote:
> 2007/7/8, Christophe GISQUET <christophe.gisquet at free.fr>:
> > Zuxy Meng wrote:
> > > I did a quick test on 64-bit K8 tonight thanks to Stephan's testbed.
> > And myself on an x86-64 Core 2 system.
> > > The result wasn't promising. In short, from fastest to slowest:
> > > MMX > SSE2 w/o sw pipelining > SSE2 w/ sw pipelining
> > I haven't tested the mid-performer, but I can confirm this. Using
> > START/STOP_TIMER, the figures are (on a 1080p sequence): ~2800
> > dezicycles for MMX, ~3800 for SSE2.
> I suspect something is wrong. IIRC 32-bit SSE2 (w/ sw pipelining)
> is faster than MMX on your Conroe. How can it be more than 25% slower
> under 64-bit?
That's indeed surprising. The only difference that I know of on Conroe
in 64-bit mode is that less micro-op fusion takes place, if any at all.
That shouldn't be the cause for that huge slowdown however IMHO.
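
For reference, here is a rough sketch of how per-call figures like the ones
quoted above can be accumulated and compared. It is not the actual
START/STOP_TIMER code: the TimerStats type, the function names and the
outlier threshold (8x the running mean) are assumptions made for
illustration. It also assumes "dezicycles" means tenths of a CPU cycle, so
~2800 would correspond to roughly 280 cycles per call.

```c
#include <stdint.h>

/* Hypothetical accumulator for cycle measurements; the names and the
 * outlier rule are illustrative, not taken from FFmpeg. */
typedef struct {
    uint64_t sum;      /* total cycles over accepted samples */
    unsigned count;    /* number of accepted samples */
    unsigned skipped;  /* number of samples rejected as outliers */
} TimerStats;

/* Record one measured duration in cycles. Once two samples have been
 * accepted, a sample of 8x the running mean or more is counted as an
 * outlier (e.g. an interrupt hit during the call) and left out. */
static void timer_add(TimerStats *s, uint64_t cycles)
{
    if (s->count < 2 || cycles < 8 * s->sum / s->count) {
        s->sum += cycles;
        s->count++;
    } else {
        s->skipped++;
    }
}

/* Mean duration in tenths of a cycle; 0 if nothing was recorded. */
static uint64_t timer_mean_decicycles(const TimerStats *s)
{
    return s->count ? s->sum * 10 / s->count : 0;
}

/* Relative slowdown of `slow` versus `fast`, in percent. */
static double slowdown_percent(double fast, double slow)
{
    return (slow - fast) / fast * 100.0;
}
```

With the two figures from the thread, slowdown_percent(2800.0, 3800.0)
comes out between 35 and 36 percent, which is consistent with the
"more than 25% slower" remark.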