[Ffmpeg-devel] [PATCH] snow mmx + sse2 part 5
Wolfram Gloger
wmglo
Mon Apr 17 00:05:48 CEST 2006
[ff_snow_inner_add_yblock_mmx appears to be miscompiled with gcc-2.95.x]
Here is a fix for this issue: the patch changes the constraint on the
src_stride operand from "rm" to "m" in both asm blocks. The asm already
loads that operand into a register explicitly, so I can see no
performance loss from this change.
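
To illustrate why a memory-only constraint should cost nothing here, the
following is a minimal sketch, not taken from snowdsp_mmx.c. It assumes
GCC-style extended asm on x86 and a register-width intptr_t. The names
next_row_offset, next_row_offset_selfcheck and SCRATCH_REG are hypothetical.
The operand is declared "m", and the asm itself copies it into a clobbered
scratch register before use, so letting the compiler choose a register
("rm") would not save an instruction.

```c
#include <assert.h>
#include <stdint.h>

/* SCRATCH_REG is a hypothetical macro for this sketch: a scratch register
 * that the asm loads explicitly and lists as clobbered. */
#if defined(__GNUC__) && defined(__x86_64__)
#  define SCRATCH_REG "rcx"
#elif defined(__GNUC__) && defined(__i386__)
#  define SCRATCH_REG "ecx"
#endif

/* Hypothetical helper: returns offset + stride.
 * Assumes intptr_t has the same width as SCRATCH_REG. */
static intptr_t next_row_offset(intptr_t offset, intptr_t stride)
{
#ifdef SCRATCH_REG
    __asm__ (
        /* Explicit load: the "m" operand is read from memory here. */
        "mov %1, %%" SCRATCH_REG " \n\t"
        /* offset += stride, using the register loaded above. */
        "add %%" SCRATCH_REG ", %0 \n\t"
        : "+r"(offset)          /* read-write, any general register */
        : "m"(stride)           /* memory only, as in the patch */
        : SCRATCH_REG, "cc");   /* scratch register and flags are clobbered */
    return offset;
#else
    /* Portable fallback for other compilers and architectures. */
    return offset + stride;
#endif
}

/* Small usage example for the helper above. */
void next_row_offset_selfcheck(void)
{
    assert(next_row_offset(0, 16) == 16);
    assert(next_row_offset(64, -16) == 48);
}
```

Because the scratch register appears in the clobber list, the compiler
will not place either operand in it, which keeps the explicit load safe.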
Regards,
Wolfram
--- ffmpeg/libavcodec/i386/snowdsp_mmx.c Sun Apr 16 14:43:37 2006
+++ ffmpeg-wg/libavcodec/i386/snowdsp_mmx.c Sun Apr 16 23:23:43 2006
@@ -688,7 +688,7 @@
"jnz 1b \n\t"\
:"+m"(dst8),"+m"(dst_array)\
:\
- "rm"((long)(src_x<<2)),"m"(obmc),"a"(block),"m"((long)b_h),"rm"((long)src_stride):\
+ "rm"((long)(src_x<<2)),"m"(obmc),"a"(block),"m"((long)b_h),"m"((long)src_stride):\
"%"REG_b"","%"REG_c"","%"REG_S"","%"REG_D"","%"REG_d"");
#define snow_inner_add_yblock_sse2_end_8\
@@ -861,7 +861,7 @@
"jnz 1b \n\t"\
:"+m"(dst8),"+m"(dst_array)\
:\
- "rm"((long)(src_x<<2)),"m"(obmc),"a"(block),"m"((long)b_h),"rm"((long)src_stride):\
+ "rm"((long)(src_x<<2)),"m"(obmc),"a"(block),"m"((long)b_h),"m"((long)src_stride):\
"%"REG_b"","%"REG_c"","%"REG_S"","%"REG_D"","%"REG_d"");
static void inner_add_yblock_bw_8_obmc_16_mmx(uint8_t *obmc, const long obmc_stride, uint8_t * * block, int b_w, long b_h,