[FFmpeg-devel] [PATCH 2/2] x86/vf_v360: use a faster horizontal add in remap4_8bit_line_avx2

James Almer jamrial at gmail.com
Fri Sep 6 18:49:20 EEST 2019


On 9/6/2019 12:40 PM, Paul B Mahol wrote:
> LGTM
> 
> On 9/6/19, James Almer <jamrial at gmail.com> wrote:
>> Signed-off-by: James Almer <jamrial at gmail.com>
>> ---
>>  libavfilter/x86/vf_v360.asm | 11 ++++-------
>>  1 file changed, 4 insertions(+), 7 deletions(-)
>>
>> diff --git a/libavfilter/x86/vf_v360.asm b/libavfilter/x86/vf_v360.asm
>> index f49702b603..a0936eb6dc 100644
>> --- a/libavfilter/x86/vf_v360.asm
>> +++ b/libavfilter/x86/vf_v360.asm
>> @@ -130,14 +130,11 @@ cglobal remap4_8bit_line, 7, 9, 11, dst, width, src,
>> in_linesize, u, v, ker, x,
>>          pmulld          m4, m5
>>
>>          paddd           m2, m4
>> -        vextracti128   xm1, m2, 1
>> -        paddd           m1, m2
>> -        phaddd          m1, m1
>> -        phaddd          m1, m1
>> -        psrld           m1, m1, 0xe
>> -        packuswb        m1, m1
>> +        HADDD           m2, m1
>> +        psrld           m2, m2, 0xe
>> +        packuswb        m2, m2
>>
>> -        pextrb   [dstq+xq], xm1, 0
>> +        pextrb   [dstq+xq], xm2, 0
>>
>>          add   xq, 1
>>          add   yq, 32

Pushed, thanks.


More information about the ffmpeg-devel mailing list