[FFmpeg-devel] [PATCH] SSE2 and SSSE3 versions of h264 biweight prediction code (biweight_h264_pixels_tab)

Eli Friedman eli.friedman
Sat Jul 31 04:47:12 CEST 2010


On Thu, Jul 29, 2010 at 11:15 AM, Eli Friedman <eli.friedman at gmail.com> wrote:
> On Thu, Jul 29, 2010 at 9:23 AM, Ronald S. Bultje <rsbultje at gmail.com> wrote:
>> Hi,
>>
>> On Thu, Jul 29, 2010 at 12:32 AM, Eli Friedman <eli.friedman at gmail.com> wrote:
>>> Patch attached.  Loosely based off of the MMX2 version.  Around 1%
>>> faster overall on a test file on my Mobile Core i5.
>> [..]
>>> +cglobal h264_biweight_8x8_ssse3, 7, 7, 8
>>> +    BIWEIGHT_SSSE3_SETUP
>>> +    mov        r3, 4
>>> +
>>> +.nextrow
>>> +    BIWEIGHT_SSSE3_OP r2
>>> +    movh       [r0], m0
>>> +    movhps     [r0+r2], m0
>>> +    lea        r0, [r0+r2*2]
>>> +    lea        r1, [r1+r2*2]
>>> +    dec        r3
>>> +    jnz .nextrow
>>> +    REP_RET
>>
>> You have several unused r%d regs here, maybe you want to use lea r4,
>> [r2*2] and then use add r0/r1, r4 instead of lea, that should result
>> in slightly smaller code. Same for h264_biweight_8x8_sse2.
>
> Will do.

Done in attached.
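
For readers without the patch open, the loop shape after this change can be
sketched as a scalar C model. This is only an illustration under stated
assumptions: the names `biweight_8x8_model` and `clip_u8` are hypothetical,
the parameter order is invented for the sketch, and the rounding term is
assumed to match the C reference code rather than taken from anything quoted
in this thread. The point it shows is that the doubled stride is computed
once before the loop (the suggested lea r4, [r2*2]) and each iteration then
advances both pointers with a plain add.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp an int to the 0..255 range of an 8-bit pixel. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Hypothetical scalar model of an 8x8 biweight: dst is blended in place
 * with src.  Two rows are handled per iteration, as in the quoted asm. */
static void biweight_8x8_model(uint8_t *dst, const uint8_t *src,
                               ptrdiff_t stride, int log2_denom,
                               int weightd, int weights, int offset)
{
    assert(log2_denom >= 0 && log2_denom <= 7);

    /* Assumed rounding term: force the offset odd, then scale it.
     * A multiply is used so that a negative offset is well defined. */
    const int round = ((offset + 1) | 1) * (1 << log2_denom);
    const int shift = log2_denom + 1;

    /* Hoisted out of the loop, like lea r4, [r2*2]. */
    const ptrdiff_t stride2 = stride * 2;

    for (int rows = 4; rows > 0; rows--) {   /* 4 iterations x 2 rows */
        for (int x = 0; x < 8; x++) {
            /* Negative sums rely on an arithmetic right shift, which
             * common compilers provide. */
            int top = src[x] * weights + dst[x] * weightd + round;
            int bot = src[x + stride] * weights
                    + dst[x + stride] * weightd + round;
            dst[x]          = clip_u8(top >> shift);
            dst[x + stride] = clip_u8(bot >> shift);
        }
        dst += stride2;   /* add r0, r4 */
        src += stride2;   /* add r1, r4 */
    }
}
```

With equal weights of 32, log2_denom of 5 and a zero offset, the model
reduces to a plain average of the two blocks, which makes it easy to
sanity-check against the asm.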

>>> +%macro BIWEIGHT_SSSE3_OP 1
>>> +    movh       m0, [r0]
>>> +    movh       m1, [r1]
>>> +    movh       m2, [r0+%1]
>>> +    movh       m3, [r1+%1]
>>> +    punpcklbw  m0, m1
>>> +    punpcklbw  m2, m3
>>
>> If you don't use m1/m3 afterwards, you can IIRC just punpcklbw m0,
>> [r0+%1] and same for the line below.
>
> I don't have appropriate alignment for the 8x8 case, but I suppose I
> can do it in the 16x16 case.

Done in attached; saves one instruction per 16 pixels in the 16x16
case.  (I could fiddle with it to remove another load, but I doubt the
speed would be significantly different.)
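
As a reminder of what the unpack step does: punpcklbw a, b interleaves the
low eight bytes of a and b, writing the result to a.  The scalar C model
below is a sketch only, and `punpcklbw_model` is a hypothetical helper name.
It also hints at why the memory-operand form only fits the 16x16 case: the
non-VEX SSE encoding requires a 16-byte aligned memory operand, which 8-byte
rows in the 8x8 case do not guarantee.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of punpcklbw on 16-byte registers:
 * out = a0 b0 a1 b1 ... a7 b7.  Only the low 8 bytes of a and b are read. */
static void punpcklbw_model(uint8_t out[16],
                            const uint8_t a[16], const uint8_t b[16])
{
    assert(out && a && b);

    uint8_t tmp[16];   /* temporary so that out may alias a, as in the asm */
    for (int i = 0; i < 8; i++) {
        tmp[2 * i]     = a[i];
        tmp[2 * i + 1] = b[i];
    }
    for (int i = 0; i < 16; i++)
        out[i] = tmp[i];
}
```

After the interleave, each 16-bit lane holds one byte from each source
block, so a single pairwise multiply-add step can apply both weights at
once.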

-Eli
-------------- next part --------------
A non-text attachment was scrubbed...
Name: biweight.patch
Type: text/x-patch
Size: 7550 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100730/9e554555/attachment.bin>


