[FFmpeg-devel] [PATCH] ff_h[yc]scale_fast_mmext always clobbers %rbx
Nick Lewycky
nlewycky at google.com
Thu May 7 01:08:09 CEST 2015
On 6 May 2015 at 15:06, Michael Niedermayer <michaelni at gmx.at> wrote:
> Hi
>
> On Wed, May 06, 2015 at 11:52:59AM -0700, Nick Lewycky wrote:
> > Newer versions of clang will allocate %rbx as one of the inline asm
> inputs,
>
> Thats great
>
>
> > even in PIC. This occurs when building ffmpeg with clang
> -fsanitize=address
> > -O1 -fPIE. Because the asm does clobber %bx whether PIC is on or off,
> just
> > include %bx in the clobber list regardless of whether PIC is on or off.
>
> you cant include *bx in the clobber list with PIC
> it breaks compilers that are less great, like gcc
>
Doh!! I did check for this, but only tried x86-64, not x86-32. Sorry!
gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
> ../libswscale/x86/hscale_fast_bilinear_simd.c: In function
> ‘ff_hyscale_fast_mmxext’:
> ../libswscale/x86/hscale_fast_bilinear_simd.c:205:5: error: PIC register
> clobbered by ‘%ebx’ in ‘asm’
> ../libswscale/x86/hscale_fast_bilinear_simd.c: In function
> ‘ff_hcscale_fast_mmxext’:
> ../libswscale/x86/hscale_fast_bilinear_simd.c:276:5: error: PIC register
> clobbered by ‘%ebx’ in ‘asm’
>
>
> also what exactly are you trying to fix ?
> or rather what exactly goes how exactly wrong with the code as it is
> if rbx is used ?
>
Ok, let's look at ff_hcscale_fast_mmext. Preprocessor directives evaluated
in PIC x86-64, the inline constraints work out to:
:: "m" (src1), "m" (dst1), "m" (filter), "m" (filterPos),
"m" (mmxextFilterCode), "m" (src2), "m"(dst2)
,"m" (ebxsave)
,"m"(retsave)
: "%"REG_a, "%"REG_c, "%"REG_d, "%"REG_S, "%"REG_D
so clang looks at that and decides that it can pick src1 = (%r10), dst1 =
(%r8), filter = (%r11), filterPos = (%r12), mmxextFilterCode = (%r15), src2
= (%rbx), dst2 = (%r14), ebxsave = (%r13), retsave = (%r9). The problem
there is src2 being (%rbx).
Now let's look at how we use them:
"mov %0, %%"REG_c" \n\t"
"mov %1, %%"REG_D" \n\t"
"mov %2, %%"REG_d" \n\t"
"mov %3, %%"REG_b" \n\t" // Clobbers %rbx / src2
/ %5 here
"xor %%"REG_a", %%"REG_a" \n\t"
PREFETCH" (%%"REG_c") \n\t"
PREFETCH" 32(%%"REG_c") \n\t"
PREFETCH" 64(%%"REG_c") \n\t"
CALL_MMXEXT_FILTER_CODE
CALL_MMXEXT_FILTER_CODE
CALL_MMXEXT_FILTER_CODE
CALL_MMXEXT_FILTER_CODE
"xor %%"REG_a", %%"REG_a" \n\t"
"mov %5, %%"REG_c" \n\t" // %5 is read too late,
we get %3 / filterPos instead
"mov %6, %%"REG_D" \n\t"
PREFETCH" (%%"REG_c") \n\t"
PREFETCH" 32(%%"REG_c") \n\t"
PREFETCH" 64(%%"REG_c") \n\t"
CALL_MMXEXT_FILTER_CODE // Crash...
The two marked instructions are intended to refer to different contents.
CALL_MMXEXT_FILTER_CODE moves RBX to ESI and calls mmxextFilterCode:
"movl (%%"REG_b"), %%esi \n\t"\
"call *%4 \n\t"\ // Crash...
That callee ultimately segfaults:
=> 0x7fffefa45000: movq (%rdx,%rax,1),%mm3
=> 0x7fffefa45004: movd (%rcx,%rsi,1),%mm0
Program received signal SIGSEGV, Segmentation fault.
0x00007fffefa45004 in ?? ()
=> 0x7fffefa45004: movd (%rcx,%rsi,1),%mm0
I'm trying to fix this segfault.
also ideally changes to this code should be tested against gcc/clang/icc
> of various versions with and without PIC, 32 and 64 bit
> this code is more tricky than than the average so theres a good
> change changes to it will break some compiler
>
I understand that, but I may not be able to test as many environments as
you like. I'm going to give testing it my best effort, but I can tell you
up front that I'm only going to test gcc and clang, on an x86 Ubuntu
machine. I don't have ICC, so I can't test that.
More information about the ffmpeg-devel
mailing list