[FFmpeg-devel] r9017 breaks WMA decoding on Intel Macs
Wed May 30 23:29:09 CEST 2007
On Wed, May 30, 2007 at 02:07:19PM +0200, Guillaume POIRIER wrote:
> On 5/30/07, Zuxy Meng <zuxy.meng at gmail.com> wrote:
> > 2007/5/30, Guillaume POIRIER <poirierg at gmail.com>:
> > > On 5/30/07, Trent Piepho <xyzzy at speakeasy.org> wrote:
> > > > > On Wed, 30 May 2007, Guillaume POIRIER wrote:
> > > > > On 5/29/07, Zuxy Meng <zuxy.meng at gmail.com> wrote:
> > > > > > > These warnings come from the assembler, not the compiler, about cases
> > > > > > > like 16+(%esi). The FSF as treats this as equivalent to 16+0(%esi) ==
> > > > > > > 16(%esi) (therefore the assumed 0). If the Apple as treats it
> > > > > > differently without even a warning then the result is catastrophic...
> > > > > >
> > > > > Linux:
> > > > > 1bd: 0f 28 02 movaps (%edx),%xmm0
> > > > > 1c0: 0f 28 19 movaps (%ecx),%xmm3
> > > > > 1c3: 0f 28 62 f0 movaps 0xfffffff0(%edx),%xmm4
> > > > > 1c7: 0f 28 79 10 movaps 0x10(%ecx),%xmm7
> > > > >
> > > > > 000001d7 movaps (%ebx),%xmm0
> > > > > 000001da movaps (%edi),%xmm3
> > > > > 000001dd movaps 0x00(%ebx),%xmm4
> > > > > 000001e1 movaps 0x00(%edi),%xmm7
> > > > >
> > > > > > As you can clearly see, that damn OSX manages to lose the offset.
> > > > > Zuxy, do you know another syntax than the one you suggested, that
> > > > > wouldn't confuse OSX's assembler?
> > > >
> > > > Doesn't my patch fix this? That would be the alternate syntax that doesn't
> > > > confuse the assembler.
> > >
> > > Yep, your fixed patch does fix the problem (I said that earlier BTW ;-) ).
> > > Now that we know where the problem comes from, I was just wondering if
> > > > there wasn't a simpler, less-invasive way. (Not that your patch is
> > > > unbearably longer, but based on the analysis I made of the
> > > > disassembled code, it leads to more code, so I'd expect your patch to
> > > > be slower; that, of course, would have to be benchmarked.)
> > > No, it won't. Trent's patch is the correct and optimal way, giving gcc
> > > more freedom in allocating general registers. I should have done this
> > > in my original code, but I was a bit too lazy and was concerned that too
> > > many constraints would break gcc 2.95, when in fact Trent's patch
> > > compiles with gcc 2.95. So there isn't any doubt about the patch itself.
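
[Editorial sketch] Trent's patch itself is not reproduced in this message, so the following is only an illustration of the general idea discussed above, not his code. Instead of writing a hand-made "16+%1" in the asm template, each address gets its own "m" operand, so gcc prints the complete displacement(base) expression and picks the registers itself. The names vec4, copy8 and copy8_selfcheck are hypothetical, and C11 plus GCC-style inline asm are assumed.

```c
#include <assert.h>
#include <stdint.h>

/* 16 bytes of packed floats; a struct so it can be used as an asm "m" operand. */
typedef struct { float f[4]; } vec4;

/* Hypothetical helper (not from the patch): copy 8 floats.
 * Both pointers must be 16-byte aligned because movaps is used. */
static void copy8(float *dst, const float *src)
{
    assert(((uintptr_t)dst & 15) == 0 && ((uintptr_t)src & 15) == 0);
#if defined(__GNUC__) && defined(__SSE__)
    __asm__(
        "movaps %2, %%xmm0 \n\t"   /* src[0..3]: gcc prints e.g. (%esi)   */
        "movaps %3, %%xmm1 \n\t"   /* src[4..7]: gcc prints e.g. 16(%esi) */
        "movaps %%xmm0, %0 \n\t"
        "movaps %%xmm1, %1 \n\t"
        : "=m"(*(vec4 *)dst), "=m"(*(vec4 *)(dst + 4))
        : "m"(*(const vec4 *)src), "m"(*(const vec4 *)(src + 4))
        : "xmm0", "xmm1");
#else
    for (int i = 0; i < 8; i++)    /* portable fallback for other targets */
        dst[i] = src[i];
#endif
}

/* Returns the number of mismatching elements (0 means the copy worked). */
static int copy8_selfcheck(void)
{
    static _Alignas(16) float src[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    static _Alignas(16) float dst[8];
    int bad = 0;
    copy8(dst, src);
    for (int i = 0; i < 8; i++)
        bad += dst[i] != src[i];
    return bad;
}
```

Because the template never contains an "n+" prefix, the assembler never has to interpret an expression such as 16+(%esi), and the second load keeps its offset regardless of which as is used.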
> Ok, fine with me. Michael, do you think that the patch I posted
> earlier (100% based on Trent's, only fixing minor issues) should be
well, after actually reading the code ... the loops should be written
in asm, not by using for() / while(); this will make the code faster
and it will make the n+%m code naturally disappear
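
[Editorial sketch] One way to read the suggestion above, under stated assumptions: the loop counter lives in a register inside the asm block, so every access is written as (base,index) and no offset is ever glued in front of a memory operand. This is not the WMA code from the thread; vector_mul and vector_mul_selfcheck are hypothetical names, and C11 plus GCC-style inline asm are assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: dst[k] = a[k] * b[k] for n floats.
 * n must be a multiple of 4 and all pointers 16-byte aligned. */
static void vector_mul(float *dst, const float *a, const float *b, size_t n)
{
    assert(n % 4 == 0);
    if (n == 0)
        return;
#if defined(__GNUC__) && defined(__SSE__)
    /* Byte index runs from -4*n up to 0, so the loop is one add plus jnz. */
    ptrdiff_t i = -(ptrdiff_t)(n * sizeof(float));
    __asm__ volatile(
        "1:                      \n\t"
        "movaps (%1,%0), %%xmm0  \n\t"   /* load 4 floats of a        */
        "mulps  (%2,%0), %%xmm0  \n\t"   /* multiply by 4 floats of b */
        "movaps %%xmm0, (%3,%0)  \n\t"   /* store 4 floats to dst     */
        "add    $16, %0          \n\t"   /* next 16-byte group        */
        "jnz    1b               \n\t"
        : "+r"(i)
        : "r"(a + n), "r"(b + n), "r"(dst + n)   /* pointers to the ends */
        : "xmm0", "memory", "cc");
#else
    for (size_t k = 0; k < n; k++)       /* portable fallback */
        dst[k] = a[k] * b[k];
#endif
}

/* Returns the number of wrong elements (0 means the loop worked). */
static int vector_mul_selfcheck(void)
{
    static _Alignas(16) float a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    static _Alignas(16) float b[8] = {2, 2, 2, 2, 3, 3, 3, 3};
    static _Alignas(16) float d[8];
    int bad = 0;
    vector_mul(d, a, b, 8);
    for (int k = 0; k < 8; k++)
        bad += d[k] != a[k] * b[k];
    return bad;
}
```

If a fixed offset were needed inside such a loop, it would be written as an explicit displacement, for example 16(%1,%0), which both the FSF and Apple assemblers parse the same way.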
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
The educated differ from the uneducated as much as the living from the
dead. -- Aristotle