[FFmpeg-devel] [PATCH] x86/dsputilenc: optimize sum_abs_dctelem functions
jamrial at gmail.com
Sun May 25 08:34:04 CEST 2014
On 25/05/14 3:18 AM, Christophe Gisquet wrote:
>> I originally tested this on an SSE2-only machine, so I didn't see the hit
>> on the SSSE3 version. Sorry about that.
> On the other hand, the difference is within measurement noise, I expect, and is
> below most actual effects. Anything below 0.5 cycles seems inconsequential
> to me.
True, but it's still a performance hit with no gain of any kind to go alongside it.
Now, if making the macro uglier is acceptable, I think making this change only for
the non-SSSE3 versions of the function would be nice. Especially the SSE2 one, which
will stop clobbering xmm6 on Win64, where that register is callee-saved.
> But never mind.
> Best regards,