[FFmpeg-devel] [PATCH] x86/dsputilenc: optimize sum_abs_dctelem functions
James Almer
jamrial at gmail.com
Sun May 25 08:34:04 CEST 2014
On 25/05/14 3:18 AM, Christophe Gisquet wrote:
> Hi,
>
>> I originally tested this on an SSE2 only machine, so I didn't see the hit
>> on the SSSE3 version. Sorry about that.
>
> On the other hand, the difference is within measurement noise, I expect, and is
> below most actual effects. Anything below 0.5 cycles seems inconsequential
> to me.
>
True, but it's still a performance hit with no gain of any kind to go alongside it.
Now, if making the macro uglier is acceptable, I think it would be nice to make this
change only for the non-SSSE3 versions of the function, especially the SSE2 one, which
would then stop clobbering xmm6 on Win64.
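
To illustrate the Win64 point: in the Win64 calling convention xmm0-xmm5 are
volatile, while xmm6-xmm15 are callee-saved, so a function that touches xmm6
has to save and restore it. Below is a minimal sketch of that rule in C. The
helper name is hypothetical (it is not part of FFmpeg or x86inc), and it
assumes registers are used contiguously starting from xmm0.

```c
#include <assert.h>

/* Win64 ABI: xmm0-xmm5 are volatile (caller-saved),
 * xmm6-xmm15 are callee-saved. */
#define WIN64_FIRST_CALLEE_SAVED_XMM 6

/* Hypothetical helper, for illustration only: given how many xmm registers
 * a function uses (counting up from xmm0), return how many of them it must
 * save and restore on Win64. */
int win64_xmm_regs_to_save(int xmm_regs_used)
{
    assert(xmm_regs_used >= 0 && xmm_regs_used <= 16);
    if (xmm_regs_used <= WIN64_FIRST_CALLEE_SAVED_XMM)
        return 0; /* only volatile registers touched, nothing to preserve */
    return xmm_regs_used - WIN64_FIRST_CALLEE_SAVED_XMM;
}
```

A function using seven registers (xmm0 through xmm6) gets 1 here, while one
using six or fewer gets 0, which is the saving the SSE2 version would see.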
> But nevermind.
>
> Best regards,
> Christophe