[FFmpeg-devel] r9017 breaks WMA decoding on Intel Macs

Zuxy Meng zuxy.meng
Sun May 27 16:21:32 CEST 2007


Hi,

2007/5/27, Guillaume Poirier <gpoirier at mplayerhq.hu>:
> Hi,
>
> On May 27, 2007, at 3:08 , Guillaume Poirier wrote:
>
> > Hi,
> >
> > On 27 May 07, at 14:52, Guillaume POIRIER wrote:
> >
> >> On 5/27/07, Guillaume POIRIER <poirierg at gmail.com> wrote:
> >>> Any vorbis should do the trick. I'll try to narrow down the
> >>> problem to
> >>> see which part of the patch broke it.
> >>
> >> This hunk is what causes the regression:
> >
> > Of course this should read: "applying this hunk fixes the
> > regression".
> >
> >
> >> Index: fft_sse.c
> >> ===================================================================
> >> --- fft_sse.c        (revision 9017)
> >> +++ fft_sse.c        (revision 6577)
> >> @@ -100,33 +100,20 @@
> >>              i = nloops*8;
> >>              asm volatile(
> >>                  "1: \n\t"
> >> -                "sub $32, %0 \n\t"
> >> +                "sub $16, %0 \n\t"
> >>                  "movaps    (%2,%0), %%xmm1 \n\t"
> >>                  "movaps    (%1,%0), %%xmm0 \n\t"
> >> -                "movaps  16(%2,%0), %%xmm5 \n\t"
> >> -                "movaps  16(%1,%0), %%xmm4 \n\t"
> >>                  "movaps     %%xmm1, %%xmm2 \n\t"
> >> -                "movaps     %%xmm5, %%xmm6 \n\t"
> >>                  "shufps      $0xA0, %%xmm1, %%xmm1 \n\t"
> >>                  "shufps      $0xF5, %%xmm2, %%xmm2 \n\t"
> >> -                "shufps      $0xA0, %%xmm5, %%xmm5 \n\t"
> >> -                "shufps      $0xF5, %%xmm6, %%xmm6 \n\t"
> >>                  "mulps   (%3,%0,2), %%xmm1 \n\t" //  cre*re cim*re
> >>                  "mulps 16(%3,%0,2), %%xmm2 \n\t" // -cim*im cre*im
> >> -                "mulps 32(%3,%0,2), %%xmm5 \n\t" //  cre*re cim*re
> >> -                "mulps 48(%3,%0,2), %%xmm6 \n\t" // -cim*im cre*im
> >>                  "addps      %%xmm2, %%xmm1 \n\t"
> >> -                "addps      %%xmm6, %%xmm5 \n\t"
> >>                  "movaps     %%xmm0, %%xmm3 \n\t"
> >> -                "movaps     %%xmm4, %%xmm7 \n\t"
> >>                  "addps      %%xmm1, %%xmm0 \n\t"
> >>                  "subps      %%xmm1, %%xmm3 \n\t"
> >> -                "addps      %%xmm5, %%xmm4 \n\t"
> >> -                "subps      %%xmm5, %%xmm7 \n\t"
> >>                  "movaps     %%xmm0, (%1,%0) \n\t"
> >>                  "movaps     %%xmm3, (%2,%0) \n\t"
> >> -                "movaps     %%xmm4, 16(%1,%0) \n\t"
> >> -                "movaps     %%xmm7, 16(%2,%0) \n\t"
> >>                  "jg 1b \n\t"
> >>                  :"+r"(i)
> >>                  :"r"(p), "r"(p + nloops), "r"(cptr)
> >>
> >>
> >> We're quite lucky, it's the shorter of the two hunks.
> >>
> >> Now I need to figure out what's wrong in that hunk.
> >
> > There's nothing wrong with this hunk!
>
> Of course there's nothing wrong with this hunk. It's the other one
> that causes the regression. I wish I had turned my brain on this
> morning when I woke up.
>
>
> > It just duplicates the original code and uses "original register
> > number" + 4.
> > Why on earth would it break on OSX and not on Linux?
> >
> > Is there some qualified guru out there who could enlighten me
> > here?
>
> I guess I'm off to read the other hunk to figure out what may be
> wrong with it. If I can't figure this out, then I'll have to compare
> the assembler emitted by GCC.

Then please check operands written like "8+%0" and "-16+%1": change
their constraints from "m" to "r" and rewrite them as "8(%0)" and
"-16(%1)". Maybe Apple's binutils doesn't accept the former syntax.
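
A minimal sketch of the kind of rewrite meant here, not taken from
fft_sse.c: the helper name, the int32_t data and the single movl are
all made up for illustration, and the non-x86 branch is only there so
the snippet builds anywhere. It shows the "r" form, where %1 expands to
a bare register and the offset is written as a normal displacement.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not FFmpeg code: returns the 32-bit value stored
 * 8 bytes past base. */
static int32_t load_at_offset_8(const int32_t *base)
{
    int32_t out;
#if defined(__i386__) || defined(__x86_64__)
    /* base is passed in a register ("r"), so %1 expands to a bare register
     * name and the displacement uses the usual disp(reg) form.
     * The alternative, an "m" operand addressed as "8+%1", leaves it to the
     * assembler to parse whatever memory operand the compiler substitutes. */
    __asm__ volatile(
        "movl 8(%1), %0 \n\t"
        : "=r"(out)
        : "r"(base)
        : "memory");   /* the asm reads memory that is not listed as an operand */
#else
    out = base[2];     /* plain C fallback for non-x86 targets */
#endif
    return out;
}

/* Small self-check: element 2 sits 8 bytes past the start of the array. */
static void check_load_at_offset_8(void)
{
    static const int32_t v[4] = { 10, 20, 30, 40 };
    assert(load_at_offset_8(v) == 30);
}
```

Since the address no longer appears as a memory operand, the compiler
has to be told about the access some other way; the "memory" clobber
above is the blunt option.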

-- 
Zuxy
Beauty is truth,
While truth is beauty.
PGP KeyID: E8555ED6



