[FFmpeg-devel] [PATCH] ff_h[yc]scale_fast_mmext always clobbers %rbx

Thu May 7 01:08:09 CEST 2015

On 6 May 2015 at 15:06, Michael Niedermayer <michaelni at gmx.at> wrote:

> Hi
>
> On Wed, May 06, 2015 at 11:52:59AM -0700, Nick Lewycky wrote:
> > Newer versions of clang will allocate %rbx as one of the inline asm
> inputs,
>
> Thats great
>
>
> > even in PIC. This occurs when building ffmpeg with clang
> -fsanitize=address
> > -O1 -fPIE. Because the asm does clobber %bx whether PIC is on or off,
> just
> > include %bx in the clobber list regardless of whether PIC is on or off.
>
> you cant include *bx in the clobber list with PIC
> it breaks compilers that are less great, like gcc
>

Doh!! I did check for this, but only tried x86-64, not x86-32. Sorry!

gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
> ../libswscale/x86/hscale_fast_bilinear_simd.c: In function
> ‘ff_hyscale_fast_mmxext’:
> ../libswscale/x86/hscale_fast_bilinear_simd.c:205:5: error: PIC register
> clobbered by ‘%ebx’ in ‘asm’
> ../libswscale/x86/hscale_fast_bilinear_simd.c: In function
> ‘ff_hcscale_fast_mmxext’:
> ../libswscale/x86/hscale_fast_bilinear_simd.c:276:5: error: PIC register
> clobbered by ‘%ebx’ in ‘asm’
>
>
> also what exactly are you trying to fix ?
> or rather what exactly goes how exactly wrong with the code as it is
> if rbx is used ?
>

Ok, let's look at ff_hcscale_fast_mmext. Preprocessor directives evaluated
in PIC x86-64, the inline constraints work out to:

        :: "m" (src1), "m" (dst1), "m" (filter), "m" (filterPos),
           "m" (mmxextFilterCode), "m" (src2), "m"(dst2)
          ,"m" (ebxsave)
          ,"m"(retsave)
        : "%"REG_a, "%"REG_c, "%"REG_d, "%"REG_S, "%"REG_D

so clang looks at that and decides that it can pick src1 = (%r10), dst1 =
(%r8), filter = (%r11), filterPos = (%r12), mmxextFilterCode = (%r15), src2
= (%rbx), dst2 = (%r14), ebxsave = (%r13), retsave = (%r9). The problem
there is src2 being (%rbx).

Now let's look at how we use them:

        "mov                 %0, %%"REG_c"  \n\t"
        "mov                 %1, %%"REG_D"  \n\t"
        "mov                 %2, %%"REG_d"  \n\t"
        "mov                 %3, %%"REG_b"  \n\t"  // Clobbers %rbx / src2
/ %5 here
        "xor          %%"REG_a", %%"REG_a"  \n\t"
        PREFETCH"   (%%"REG_c")             \n\t"
        PREFETCH" 32(%%"REG_c")             \n\t"
        PREFETCH" 64(%%"REG_c")             \n\t"

        CALL_MMXEXT_FILTER_CODE
        CALL_MMXEXT_FILTER_CODE
        CALL_MMXEXT_FILTER_CODE
        CALL_MMXEXT_FILTER_CODE
        "xor          %%"REG_a", %%"REG_a"  \n\t"
        "mov                 %5, %%"REG_c"  \n\t"  // %5 is read too late,
we get %3 / filterPos instead
        "mov                 %6, %%"REG_D"  \n\t"
        PREFETCH"   (%%"REG_c")             \n\t"
        PREFETCH" 32(%%"REG_c")             \n\t"
        PREFETCH" 64(%%"REG_c")             \n\t"

        CALL_MMXEXT_FILTER_CODE  // Crash...

The two marked instructions are intended to refer to different contents.
CALL_MMXEXT_FILTER_CODE moves RBX to ESI and calls mmxextFilterCode:

        "movl            (%%"REG_b"), %%esi     \n\t"\
        "call                    *%4            \n\t"\  // Crash...

That callee ultimately segfaults:

=> 0x7fffefa45000:      movq   (%rdx,%rax,1),%mm3
=> 0x7fffefa45004:      movd   (%rcx,%rsi,1),%mm0
Program received signal SIGSEGV, Segmentation fault.
0x00007fffefa45004 in ?? ()
=> 0x7fffefa45004:      movd   (%rcx,%rsi,1),%mm0

I'm trying to fix this segfault.

also ideally changes to this code should be tested against gcc/clang/icc
> of various versions with and without PIC, 32 and 64 bit
> this code is more tricky than than the average so theres a good
> change changes to it will break some compiler
>

I understand that, but I may not be able to test as many environments as
you like. I'm going to give testing it my best effort, but I can tell you
up front that I'm only going to test gcc and clang, on an x86 Ubuntu
machine. I don't have ICC, so I can't test that.