[FFmpeg-devel] [PATCH] h264: mark xmm registers as clobbered in h264 qpel sse functions

Ramiro Polla ramiro.polla
Fri Oct 8 00:39:24 CEST 2010


On Thu, Oct 7, 2010 at 7:32 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> On Thu, Oct 07, 2010 at 06:26:59PM -0300, Ramiro Polla wrote:
>> $subj, fixes h264 for win64
>
>>  h264_qpel_mmx.c |   21 +++++++++++++++++++++
>>  1 file changed, 21 insertions(+)
>> f7a94ad1d2ef7ffec9ce1a219ad9c9c4efc0e8d7  xmm_clobbers_h264_qpel_mmx.diff
>> Index: libavcodec/x86/h264_qpel_mmx.c
>> ===================================================================
>> --- libavcodec/x86/h264_qpel_mmx.c    (revision 25401)
>> +++ libavcodec/x86/h264_qpel_mmx.c    (working copy)
>> @@ -677,6 +677,10 @@
>>          : "D"((x86_reg)src2Stride), "S"((x86_reg)dstStride),\
>>            "m"(ff_pw_5), "m"(ff_pw_16)\
>>          : "memory"\
>> +            XMM_CLOBBERS(, "%xmm0", "%xmm1", "%xmm2", "%xmm3", \
>> +                         "%xmm4", "%xmm5", "%xmm6", "%xmm7", \
>> +                         "%xmm8", "%xmm9", "%xmm10", "%xmm11", \
>> +                         "%xmm12", "%xmm13", "%xmm14", "%xmm15") \
>>      );\
>>  }
>>  #else // ARCH_X86_64
>
>> @@ -699,6 +703,7 @@
>>          "pxor %%xmm7, %%xmm7        \n\t"\
>>          "movdqa %0, %%xmm6          \n\t"\
>>          :: "m"(ff_pw_5)\
>> +        XMM_CLOBBERS_ONLY("%xmm6", "%xmm7") \
>>      );\
>>      do{\
>
>
> this is wrong
> 6/7 are read after the asm

On win64, at least, gcc saves/restores the clobbered registers only in the
function prologue/epilogue, not in between asm blocks...
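
For reference, a minimal sketch of how clobber macros of this kind might
expand. The SKETCH_* names and the platform check are assumptions made for
illustration; they are not FFmpeg's actual definitions, and the real
XMM_CLOBBERS macros may be enabled by a different test. The point is only
that a clobber list tells the compiler a register is overwritten by that one
asm statement. It does not promise that the value survives until the next
asm statement.

```c
/* Hypothetical switch; a real build system would probe the compiler instead. */
#if defined(__x86_64__) && defined(__GNUC__)
#define SKETCH_HAVE_XMM_CLOBBERS 1
#else
#define SKETCH_HAVE_XMM_CLOBBERS 0
#endif

#if SKETCH_HAVE_XMM_CLOBBERS
/* Appends to an existing clobber list; the caller passes a leading comma. */
#define SKETCH_XMM_CLOBBERS(...)      __VA_ARGS__
/* Opens the clobber list when the statement has no other clobbers. */
#define SKETCH_XMM_CLOBBERS_ONLY(...) : __VA_ARGS__
#else
#define SKETCH_XMM_CLOBBERS(...)
#define SKETCH_XMM_CLOBBERS_ONLY(...)
#endif

/* Statement with no other clobbers: zero xmm7 and read it back. */
static int sketch_zero_via_xmm7(void)
{
    int out = -1;
#if SKETCH_HAVE_XMM_CLOBBERS
    __asm__ volatile(
        "pxor %%xmm7, %%xmm7 \n\t"
        "movd %%xmm7, %0     \n\t"
        : "=r"(out)
        :
        SKETCH_XMM_CLOBBERS_ONLY("%xmm7")
    );
#else
    out = 0; /* portable fallback so the sketch builds on other targets */
#endif
    return out;
}

/* Statement that already clobbers "memory": add two ints through xmm0/xmm1. */
static int sketch_add_via_xmm(int a, int b)
{
    int out = 0;
#if SKETCH_HAVE_XMM_CLOBBERS
    __asm__ volatile(
        "movd  %1, %%xmm0     \n\t"
        "movd  %2, %%xmm1     \n\t"
        "paddd %%xmm1, %%xmm0 \n\t"
        "movd  %%xmm0, %0     \n\t"
        : "=r"(out)
        : "r"(a), "r"(b)
        : "memory" SKETCH_XMM_CLOBBERS(, "%xmm0", "%xmm1")
    );
#else
    out = a + b; /* portable fallback */
#endif
    return out;
}
```

Both functions finish with the register inside the same asm statement that
wrote it, which is the case a clobber list describes correctly. The hunk with
XMM_CLOBBERS_ONLY("%xmm6", "%xmm7") differs because later asm blocks read
xmm6/xmm7 again.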

> correct is to either merge the asm blocks or to put manual store restore code
> for the xmm registers there under appropriate ifdef

But anyway, it really is wrong. I assume Ronald will want to merge the
blocks =). I'll be able to look into this late next week.
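
A rough sketch of what "merging the blocks" could look like, assuming GCC or
Clang on x86-64. The function, the types and the sketch_pw_5 constant are
hypothetical stand-ins (sketch_pw_5 plays the role of ff_pw_5); this is not
the h264 qpel code. The setup of xmm6/xmm7 and the code that reads them live
in one asm statement, so the clobber list describes everything that happens
to those registers.

```c
#include <stdint.h>

/* Eight 16-bit words, 16-byte aligned so movdqa can be used on them. */
typedef struct { uint16_t w[8]; } __attribute__((aligned(16))) sketch_vec8;
typedef struct { uint8_t b[8]; } sketch_bytes8;

/* Hypothetical stand-in for ff_pw_5: eight words holding the value 5. */
static const sketch_vec8 sketch_pw_5 = {{5, 5, 5, 5, 5, 5, 5, 5}};

/* Zero-extend 8 bytes to 16-bit words and multiply each by 5. */
static void sketch_widen_mul5(sketch_vec8 *dst, const sketch_bytes8 *src)
{
#if defined(__x86_64__) && defined(__GNUC__)
    __asm__ volatile(
        "pxor      %%xmm7, %%xmm7 \n\t" /* xmm7 = 0, set up here ...        */
        "movdqa    %2, %%xmm6     \n\t" /* xmm6 = 8 x 5, set up here ...    */
        "movq      %1, %%xmm0     \n\t" /* load 8 source bytes              */
        "punpcklbw %%xmm7, %%xmm0 \n\t" /* ... and xmm7 is read here        */
        "pmullw    %%xmm6, %%xmm0 \n\t" /* ... and xmm6 is read here        */
        "movdqa    %%xmm0, %0     \n\t" /* store 8 result words             */
        : "=m"(*dst)
        : "m"(*src), "m"(sketch_pw_5)
        : "%xmm0", "%xmm6", "%xmm7"
    );
#else
    /* Portable fallback so the sketch builds on other targets. */
    for (int i = 0; i < 8; i++)
        dst->w[i] = (uint16_t)(src->b[i] * 5);
#endif
}
```

The alternative mentioned above, manual store/restore of the xmm registers
under an ifdef, would keep the blocks separate at the cost of extra moves.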


