[FFmpeg-devel] [PATCH 3/3] x86/vp9lpf: use fewer instructions in SPLATB_MIX
James Almer
jamrial at gmail.com
Mon Aug 4 18:17:28 CEST 2014
On 04/08/14 10:27 AM, Ronald S. Bultje wrote:
> Hi,
>
>
> On Sun, Aug 3, 2014 at 10:53 PM, James Almer <jamrial at gmail.com> wrote:
>
>> Signed-off-by: James Almer <jamrial at gmail.com>
>> ---
>> libavcodec/x86/vp9lpf.asm | 5 ++---
>> 1 file changed, 2 insertions(+), 3 deletions(-)
>>
>> diff --git a/libavcodec/x86/vp9lpf.asm b/libavcodec/x86/vp9lpf.asm
>> index c5db0ca..def7d5a 100644
>> --- a/libavcodec/x86/vp9lpf.asm
>> +++ b/libavcodec/x86/vp9lpf.asm
>> @@ -302,9 +302,8 @@ SECTION .text
>> pshufb %1, %2
>> %else
>> punpcklbw %1, %1
>> - punpcklqdq %1, %1
>> - pshuflw %1, %1, 0
>> - pshufhw %1, %1, 0x55
>> + punpcklwd %1, %1
>> + punpckldq %1, %1
>
>
> Doesn't this miss the upper half of the register?
>
> Ronald
No, the upper half gets filled as well. Using the example given above the macro:
..............AB (start value)
punpcklbw
............AABB
punpcklwd
........AAAABBBB
punpckldq
AAAAAAAABBBBBBBB
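
For anyone who wants to check the trace without assembling anything, here is a minimal sketch that models the three unpack instructions on a 16-byte register in Python. The helper names (punpckl, show, splatb_mix) are made up for illustration and are not part of FFmpeg; the model only covers the case used in the patch, where source and destination are the same register.

```python
def punpckl(reg, width):
    """Model punpcklbw/punpcklwd/punpckldq with both operands equal to `reg`.

    `reg` is a list of 16 one-character "bytes", index 0 = least significant.
    `width` is the element size in bytes: 1 (bw), 2 (wd) or 4 (dq).
    """
    low = reg[:8]  # the unpack-low instructions only read the low 64 bits
    out = []
    for i in range(0, 8, width):
        elem = low[i:i + width]
        out += elem + elem  # interleaving a register with itself duplicates each element
    return out


def show(reg):
    # Print most significant byte first, matching the diagrams in this thread.
    return "".join(reversed(reg))


def splatb_mix(reg):
    # Hypothetical stand-in for the non-SSSE3 branch of the macro after the patch.
    reg = punpckl(reg, 1)  # punpcklbw %1, %1
    reg = punpckl(reg, 2)  # punpcklwd %1, %1
    reg = punpckl(reg, 4)  # punpckldq %1, %1
    return reg


start = list(reversed("..............AB"))
print(show(punpckl(start, 1)))  # ............AABB
print(show(splatb_mix(start)))  # AAAAAAAABBBBBBBB
```

Each unpack doubles the number of bytes covered by A and B, so three of them take the two low bytes to the full 128 bits, upper half included.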