[FFmpeg-devel] r9017 breaks WMA decoding on Intel Macs

Guillaume POIRIER poirierg
Sun May 27 14:52:18 CEST 2007


Hi,

On 5/27/07, Guillaume POIRIER <poirierg at gmail.com> wrote:
> Hi,
>
> On 5/26/07, Zuxy Meng <zuxy.meng at gmail.com> wrote:
> > Hi,
> >
> > 2007/5/26, Tyler Loch <tylerl82 at mac.com>:
> > > r9017's /libavcodec/i386/fft_sse.c updates cause garbled output from
> > > WMA audio on Intel Mac OS X (Core Duo CPUs).
> > >
> > > Reverting to r6577's fft_sse.c decodes WMA audio perfectly.
>
> I can confirm this on OSX with all vorbis samples I have. On
> Linux/AMD64, there's no problem (no idea if 32-bit mode is OK or
> not).
>
> > Would you provide me a sample so I can investigate it?
>
> Any vorbis should do the trick. I'll try to narrow down the problem to
> see which part of the patch broke it.

This hunk is what causes the regression. Note that the diff goes from
r9017 back to r6577, so the '-' lines are the ones r9017 added:

Index: fft_sse.c
===================================================================
--- fft_sse.c	(revision 9017)
+++ fft_sse.c	(revision 6577)
@@ -100,33 +100,20 @@
             i = nloops*8;
             asm volatile(
                 "1: \n\t"
-                "sub $32, %0 \n\t"
+                "sub $16, %0 \n\t"
                 "movaps    (%2,%0), %%xmm1 \n\t"
                 "movaps    (%1,%0), %%xmm0 \n\t"
-                "movaps  16(%2,%0), %%xmm5 \n\t"
-                "movaps  16(%1,%0), %%xmm4 \n\t"
                 "movaps     %%xmm1, %%xmm2 \n\t"
-                "movaps     %%xmm5, %%xmm6 \n\t"
                 "shufps      $0xA0, %%xmm1, %%xmm1 \n\t"
                 "shufps      $0xF5, %%xmm2, %%xmm2 \n\t"
-                "shufps      $0xA0, %%xmm5, %%xmm5 \n\t"
-                "shufps      $0xF5, %%xmm6, %%xmm6 \n\t"
                 "mulps   (%3,%0,2), %%xmm1 \n\t" //  cre*re cim*re
                 "mulps 16(%3,%0,2), %%xmm2 \n\t" // -cim*im cre*im
-                "mulps 32(%3,%0,2), %%xmm5 \n\t" //  cre*re cim*re
-                "mulps 48(%3,%0,2), %%xmm6 \n\t" // -cim*im cre*im
                 "addps      %%xmm2, %%xmm1 \n\t"
-                "addps      %%xmm6, %%xmm5 \n\t"
                 "movaps     %%xmm0, %%xmm3 \n\t"
-                "movaps     %%xmm4, %%xmm7 \n\t"
                 "addps      %%xmm1, %%xmm0 \n\t"
                 "subps      %%xmm1, %%xmm3 \n\t"
-                "addps      %%xmm5, %%xmm4 \n\t"
-                "subps      %%xmm5, %%xmm7 \n\t"
                 "movaps     %%xmm0, (%1,%0) \n\t"
                 "movaps     %%xmm3, (%2,%0) \n\t"
-                "movaps     %%xmm4, 16(%1,%0) \n\t"
-                "movaps     %%xmm7, 16(%2,%0) \n\t"
                 "jg 1b \n\t"
                 :"+r"(i)
                 :"r"(p), "r"(p + nloops), "r"(cptr)


We're quite lucky: it's the shorter of the two hunks.

Now I need to figure out what's wrong in that hunk.
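
To compare the output of either version of the loop against something
known-good, a plain C reference of the butterfly could help. This is only
a sketch: the names (Complex, butterfly_pass_ref) are made up for
illustration, and it assumes plain (cre, cim) twiddles, whereas the table
behind cptr looks expanded (cre, cim followed by -cim, cre), judging by
the comments in the asm.

```c
/* Scalar sketch of the butterfly pass; all names here are hypothetical. */
typedef struct {
    float re, im;   /* assumed to match the interleaved re/im layout */
} Complex;

/*
 * For each k in [0, n): t = c[k] * q[k] (complex multiply),
 * then p[k] = p[k] + t and q[k] = (old p[k]) - t.
 * Assumption: c holds one plain (cre, cim) twiddle per element.
 */
static void butterfly_pass_ref(Complex *p, Complex *q,
                               const Complex *c, int n)
{
    for (int k = 0; k < n; k++) {
        /* cre*re - cim*im and cim*re + cre*im, as in the asm comments */
        float tre = c[k].re * q[k].re - c[k].im * q[k].im;
        float tim = c[k].im * q[k].re + c[k].re * q[k].im;

        /* keep the old p[k] so both outputs use the same input */
        float pre = p[k].re;
        float pim = p[k].im;

        p[k].re = pre + tre;
        p[k].im = pim + tim;
        q[k].re = pre - tre;
        q[k].im = pim - tim;
    }
}
```

Running this on the same input as the SSE loop (with q = p + nloops) and
diffing the results should show which elements come out wrong.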

Guillaume
-- 
Y'a pas de gonzesse hooligan,
Imbécile et meurtrière
Y'en a pas même en grande Bretagne
A part bien sûr Madame Thatcher
  -- Renaud (sur "Miss Maggie")


