[FFmpeg-devel] [PATCH] x86/dsputilenc: optimize sum_abs_dctelem functions

James Almer jamrial at gmail.com
Sun May 25 08:34:04 CEST 2014


On 25/05/14 3:18 AM, Christophe Gisquet wrote:
> Hi,
> 
>> I originally tested this on an SSE2-only machine, so I didn't see the hit
>> on the SSSE3 version. Sorry about that.
> 
> On the other hand, the difference is within measurement noise, I expect, and
> is below most actual effects. Anything below 0.5 cycles seems inconsequential
> to me.
> 

True, but it's still a performance hit with no gain of any kind to go alongside it.

Now, if making the macro uglier is acceptable, I think making this change only for
the non-SSSE3 versions of the function would be nice. Especially the SSE2 one, which
will stop clobbering xmm6 (a callee-saved register) on Win64.

> But never mind.
> 
> Best regards,
> Christophe