[FFmpeg-cvslog] r25597 - trunk/libavcodec/x86/h264_qpel_mmx.c
Yuriy Kaminskiy
yumkam
Fri Oct 29 02:52:23 CEST 2010
Michael Niedermayer wrote:
> On Thu, Oct 28, 2010 at 11:38:39PM +0400, Yuriy Kaminskiy wrote:
>> ramiro wrote:
>>> Author: ramiro
>>> Date: Thu Oct 28 20:22:21 2010
>>> New Revision: 25597
>>>
>>> Log:
>>> h264dsp: merge some more asm blocks
>>>
>>> Modified:
>>> trunk/libavcodec/x86/h264_qpel_mmx.c
>>>
>>> Modified: trunk/libavcodec/x86/h264_qpel_mmx.c
>>> ==============================================================================
>>> --- trunk/libavcodec/x86/h264_qpel_mmx.c Thu Oct 28 15:20:26 2010 (r25596)
>>> +++ trunk/libavcodec/x86/h264_qpel_mmx.c Thu Oct 28 20:22:21 2010 (r25597)
>>> @@ -439,12 +434,8 @@ static av_always_inline void OPNAME ## h
>>> QPEL_H264HV(%%mm5, %%mm0, %%mm1, %%mm2, %%mm3, %%mm4, 5*48)\
>>> QPEL_H264HV(%%mm0, %%mm1, %%mm2, %%mm3, %%mm4, %%mm5, 6*48)\
>>> QPEL_H264HV(%%mm1, %%mm2, %%mm3, %%mm4, %%mm5, %%mm0, 7*48)\
>>> - : "+a"(src)\
>>> - : "c"(tmp), "S"((x86_reg)srcStride), "m"(ff_pw_5), "m"(ff_pw_16)\
>>> - : "memory"\
>>> - );\
>>> - if(size==16){\
>> Size is compile-time constant, so this check was always-true, or always-false
>> before, now it is always evaluated at runtime.
>>
>>> @@ -811,13 +802,8 @@ static av_noinline void OPNAME ## h264_q
>>> QPEL_H264V_XMM(%%xmm5, %%xmm0, %%xmm1, %%xmm2, %%xmm3, %%xmm4, OP)\
>>> QPEL_H264V_XMM(%%xmm0, %%xmm1, %%xmm2, %%xmm3, %%xmm4, %%xmm5, OP)\
>>> QPEL_H264V_XMM(%%xmm1, %%xmm2, %%xmm3, %%xmm4, %%xmm5, %%xmm0, OP)\
>>> - \
>>> - : "+a"(src), "+c"(dst)\
>>> - : "S"((x86_reg)srcStride), "D"((x86_reg)dstStride), "m"(ff_pw_5), "m"(ff_pw_16)\
>>> - : "memory"\
>>> - );\
>>> - if(h==16){\
>> Same here, h is compile-time constant.
>
> why do you think so?
> the functions are marked as av_noinline
Hmm. Indeed, *this* one is marked noinline, so my comment does not apply. But
*two others* are marked always_inline.
*However*, all 3 are called from noinline code, so size/h ends up *not* being a
compile-time constant (moreover, if they *were* compile-time constants, this would
be a compile-time error on
"cmpl $16, %4 \n\t"\
)
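To illustrate the point about always_inline helpers reached through a noinline caller, here is a minimal C sketch, assuming GCC or Clang attribute syntax. The macro and function names (my_always_inline, my_noinline, sum_rows and its wrappers) are hypothetical stand-ins and not the FFmpeg code; the sketch only shows where `size` can or cannot be known at compile time.

```c
#include <stddef.h>

/* Hypothetical stand-ins for av_always_inline / av_noinline (GCC/Clang syntax). */
#define my_always_inline static inline __attribute__((always_inline))
#define my_noinline      __attribute__((noinline))

/* Inlined helper: `size` is a constant here only if the caller passes one. */
my_always_inline int sum_rows(const unsigned char *src, int stride, int size)
{
    int sum = 0;
    for (int y = 0; y < 8; y++)
        sum += src[y * stride];
    if (size == 16) {   /* can be folded only when size is a known constant */
        for (int y = 8; y < 16; y++)
            sum += src[y * stride];
    }
    return sum;
}

/* Case 1: size is a runtime argument of a noinline function, so after
 * inlining sum_rows() the `size == 16` test remains a real comparison. */
my_noinline int sum_rows_runtime(const unsigned char *src, int stride, int size)
{
    return sum_rows(src, stride, size);
}

/* Case 2: literal argument; the compiler is free to resolve the branch
 * at compile time and drop the comparison. */
my_noinline int sum_rows_16(const unsigned char *src, int stride)
{
    return sum_rows(src, stride, 16);
}
```

Both wrappers return the same value for size 16; the difference is only in whether the comparison survives into the generated code, which can be checked by inspecting the compiler's assembly output.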
And I ran a benchmark; the r25597 version was *faster* :-)
Just in case, I *removed* these 3 *suspicious* noinline, and it was *faster* than
the original variant [*on my cpu*], but *slower* than the patched variant :-)
13.277s original
13.057s noinline
12.849s r25597
So, sorry for the noise.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: h264_qpel_stop_some_noinline.patch
Type: text/x-diff
Size: 2014 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-cvslog/attachments/20101029/7490db29/attachment.patch>