[FFmpeg-devel] [PATCH] Better AV_ZERO128 on x86

Jason Garrett-Glaser jason
Sat Jan 15 21:40:46 CET 2011


On Sat, Jan 15, 2011 at 12:28 PM, Alexander Strange
<astrange at ithinksw.com> wrote:
> On Sat, Jan 15, 2011 at 2:53 AM, Jason Garrett-Glaser <jason at x264.com> wrote:
>> This patch makes AV_ZERO work under SSE, instead of requiring SSE2.
>
> Fine here.
>
> The 'struct v's defined in that file need av_alias added sometime, or
> need to be moved next to the definition of av_alias64.
>
>>
>> But AV_ZERO still sucks, because it duplicates the pxor/xorps every
>> single time it runs. Is there any reason we can't do what x264 does,
>> that is, use intrinsics like ((__m128){0,0,0,0}) to solve the problem?
>> This does make gcc (even crappy old gcc) reuse the value instead of
>> calculating it every time.
>
> I only did it like this to avoid an argument about using intrinsics.
>
> Isn't it actually faster on some cpus to do xor every time, because of
> register read stalls?

Not enough for it to matter in reality...

Jason
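
For readers following the thread, here is a minimal sketch of the
intrinsics approach discussed above. It is not the patch itself and not
FFmpeg's AV_ZERO128: the names zero128 and zero_block32 are hypothetical,
it uses the standard SSE1 intrinsic _mm_setzero_ps() rather than the
((__m128){0,0,0,0}) vector literal quoted above, and it assumes an x86
target with SSE enabled and 16-byte-aligned buffers. Because the zero is
an ordinary value rather than an opaque inline-asm xor, the compiler is
free to materialize it once and reuse the register across several
stores; whether it actually does so depends on the compiler and flags.

```c
#include <stdint.h>
#include <xmmintrin.h>  /* SSE1: __m128, _mm_setzero_ps, _mm_store_ps */

/* Hypothetical helper (not FFmpeg's macro): clear 16 bytes at p.
 * p must be 16-byte aligned, since _mm_store_ps is an aligned store. */
static inline void zero128(void *p)
{
    _mm_store_ps((float *)p, _mm_setzero_ps());
}

/* Hypothetical caller: clear a 32-byte block (16 int16_t coefficients)
 * with two 128-bit stores. After inlining, both stores use the same
 * zero value, so the compiler may keep it in a single register. */
static inline void zero_block32(int16_t *block)
{
    zero128(block);      /* coefficients 0..7  */
    zero128(block + 8);  /* coefficients 8..15 */
}
```

A caller would declare the buffer with 16-byte alignment, for example
with C11 _Alignas(16) or a compiler-specific alignment attribute, and
then call zero_block32() on it.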
