[FFmpeg-devel] [PATCH 1/2] x86: move horizontal add macros to x86util

Henrik Gramner henrik at gramner.com
Sat Apr 12 09:58:51 CEST 2014


On Sat, Apr 12, 2014 at 1:45 AM, James Almer <jamrial at gmail.com> wrote:
> On 11/04/14 8:14 PM, Ronald S. Bultje wrote:
>> Hi
>>
>> On Fri, Apr 11, 2014 at 7:00 PM, James Almer <jamrial at gmail.com> wrote:
>>
>>> Also port relevant AVX2/XOP optimizations from x264
>>>
>>
>> Did you get permission from them to relicense to LGPL? I know it's trivial
>> code, but really, better safe than sorry.
>
> No. Since we were importing changes from x264's x86inc/util when they were useful
> I assumed it was ok.
>
> I wrote the HADDD xop optimization, but not the AVX2 and HADDW xop ones. I can
> remove those two if Henrik and Jason are against this.
> I'm CCing them in any case.
>
>>
>>> +%macro HADDD 2 ; sum junk
>>> +%if sizeof%1 == 32
>>> +%define %2 xmm%2
>>> +    vextracti128 %2, %1, 1
>>> +%define %1 xmm%1
>>> +    paddd   %1, %2
>>> +%endif
>>> +%if mmsize >= 16
>>> +%if cpuflag(xop) && sizeof%1 == 16
>>> +    vphadddq %1, %1
>>> +%endif
>>> +    movhlps %2, %1
>>> +    paddd   %1, %2
>>> +%endif
>>> +%if notcpuflag(xop)
>>> +    PSHUFLW %2, %1, q0032
>>> +    paddd   %1, %2
>>> +%endif
>>> +%undef %1
>>> +%undef %2
>>> +%endmacro
>>> +
>>> +%macro HADDW 2 ; reg, tmp
>>> +%if cpuflag(xop) && sizeof%1 == 16
>>> +    vphaddwq  %1, %1
>>> +    movhlps   %2, %1
>>> +    paddd     %1, %2
>>> +%else
>>> +    pmaddwd %1, [pw_1]
>>> +    HADDD   %1, %2
>>> +%endif
>>> +%endmacro
>>
>>
>> So, these require some comments on what they do - the naming is terrible.
>> It suggests that they act like phaddw/d, but they actually just act on the
>> lower half of the output register (or the full half of one, rather than
>> both, input registers). You probably want to make that explicit in a
>> comment, maybe even rename them just to prevent the obvious confusion.
>
> They are not supposed to behave like phaddd/w, which is why they are not called
> PHADDD/W.
>
> Not sure what kind of comment to add. And I'd rather not rename them. I don't
> want to deviate too much from x264's x86util unless necessary.
>
>>
>> Ronald
>> _______________________________________________
>> ffmpeg-devel mailing list
>> ffmpeg-devel at ffmpeg.org
>> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>>
>

Relicensing this as LGPL is fine with me.

Henrik
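
For readers unsure what HADDD leaves in the register, here is a minimal
scalar sketch in C of the 128-bit non-XOP path from the quoted patch
(movhlps, paddd, PSHUFLW q0032, paddd). It is an illustration only, not code
from FFmpeg or x264: the vec128 type and all helper names are hypothetical,
and q0032 is assumed to encode the pshuflw immediate 0x0E, following
x86inc's qXXXX naming. It shows the total landing in the lowest dword only,
while the other lanes end up holding junk.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one xmm register: four 32-bit lanes, d[0] lowest. */
typedef struct { uint32_t d[4]; } vec128;

/* movhlps dst, src: high 64 bits of src go to the low 64 bits of dst;
 * the high 64 bits of dst are left unchanged. */
static void movhlps(vec128 *dst, const vec128 *src)
{
    dst->d[0] = src->d[2];
    dst->d[1] = src->d[3];
}

/* paddd dst, src: lane-wise 32-bit add with wraparound. */
static void paddd(vec128 *dst, const vec128 *src)
{
    for (int i = 0; i < 4; i++)
        dst->d[i] += src->d[i];
}

/* Return 16-bit word k (0..3) of the low 64 bits of v. */
static uint16_t low_word(const vec128 *v, unsigned k)
{
    return (uint16_t)(v->d[k / 2] >> (16 * (k & 1)));
}

/* pshuflw dst, src, imm: shuffle the four low words of src by the 2-bit
 * selectors in imm; the high 64 bits are copied through from src. */
static void pshuflw(vec128 *dst, const vec128 *src, unsigned imm)
{
    uint16_t w[4];
    assert(imm < 256);
    for (unsigned i = 0; i < 4; i++)
        w[i] = low_word(src, (imm >> (2 * i)) & 3);
    dst->d[0] = w[0] | ((uint32_t)w[1] << 16);
    dst->d[1] = w[2] | ((uint32_t)w[3] << 16);
    dst->d[2] = src->d[2];
    dst->d[3] = src->d[3];
}

/* Model of "HADDD sum, junk" for a 128-bit register without XOP.
 * Only lane 0 of the result is meaningful, so only that lane is returned. */
uint32_t haddd_model(vec128 sum, vec128 junk)
{
    movhlps(&junk, &sum);       /* junk.d[0..1] = sum.d[2..3]              */
    paddd(&sum, &junk);         /* sum.d[0] = s0 + s2, sum.d[1] = s1 + s3  */
    pshuflw(&junk, &sum, 0x0E); /* assumed q0032: junk.d[0] = sum.d[1]     */
    paddd(&sum, &junk);         /* sum.d[0] = s0 + s1 + s2 + s3            */
    return sum.d[0];
}
```

Calling haddd_model with sum = {1, 2, 3, 4} returns 10 regardless of what
junk contains on entry. In the patch, HADDW's non-XOP path first runs
pmaddwd against pw_1, which folds adjacent word pairs into dwords, and then
goes through these same HADDD steps.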

