[Ffmpeg-devel] [PATCH] SSE counterpart of ff_imdct_calc_3dn2

Guillaume POIRIER poirierg
Thu Aug 24 09:59:37 CEST 2006


Hi,

On 8/24/06, Rich Felker <dalias at aerifal.cx> wrote:
> On Thu, Aug 24, 2006 at 09:53:05AM +0800, Zuxy Meng wrote:
> > Hi,
> >
> > 2006/8/23, Michael Niedermayer <michaelni at gmx.at>:
> > >Hi
> > >
> > >I've no objections to the patch (I didn't have any to the earlier patch
> > >either).
> > >I just still think the loops would be better in asm than for(){}
> >
> > I still insist that intrinsics help produce better code, at least on gcc4.
>
> And I still insist that this statement is fundamentally false. Better
> than what? Whatever code gcc generates with the intrinsics, you can
> always generate the same or better code if you just write it yourself.

Yes, you can write better code, provided that you spend some time on
it and know what you are doing (as in: have some experience
writing asm code).


> Intrinsics are also gcc4-specific

False. They existed in 3.4, and I think in 3.3 as well (I don't know about
earlier releases, but 2.95 certainly does not support them).

Also, ICC is able to process these intrinsics, whereas it has a hard
time with inline asm.


> and have the problem that
> performance is subject to the whims (and bugs) of gcc, whose record is
> very bad...

True that.

Rich, you should really consider that some people aren't willing to spend
their youth writing killer hand-tuned asm code.
  If writing killer asm takes, let's say, 3 hours, provides a 4x speed-up
and runs on only one arch (say: ia32), while the intrinsics code takes
just 1 hour, provides a 3.5x speed-up (and rising if the compiler
improves), and works on all x86 machines (32- and 64-bit), I'd say
there are perfectly good reasons to write with intrinsics.
It really depends on the goal of the code writer.
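
To make the portability point concrete, here is a minimal sketch. It is
not taken from the patch: the function names muladd_c and muladd_sse and
the multiply-add operation are made up for illustration. It assumes a
compiler that ships <xmmintrin.h> and a CPU with SSE. The same source
should build unchanged for 32-bit and 64-bit x86, with register
allocation and scheduling left to the compiler.

```c
#include <stddef.h>
#include <xmmintrin.h>  /* SSE intrinsics header */

/* Plain C reference: dst[i] = a[i] * b[i] + c[i] */
static void muladd_c(float *dst, const float *a, const float *b,
                     const float *c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] * b[i] + c[i];
}

/* Same operation with SSE intrinsics (hypothetical example function).
 * Processes four floats per iteration; unaligned loads/stores are used
 * so the caller does not have to guarantee 16-byte alignment. */
static void muladd_sse(float *dst, const float *a, const float *b,
                       const float *c, size_t n)
{
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        __m128 vc = _mm_loadu_ps(c + i);
        _mm_storeu_ps(dst + i, _mm_add_ps(_mm_mul_ps(va, vb), vc));
    }

    /* Scalar tail for the remaining 0..3 elements. */
    for (; i < n; i++)
        dst[i] = a[i] * b[i] + c[i];
}
```

Whether the generated code matches hand-written asm is exactly what is
being debated in this thread; the sketch only shows how little has to be
written to get a vectorized loop that is not tied to one register width
or calling convention.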

Guillaume

PS: yes, I totally made up the above figures
-- 
A thing is not necessarily true because a man dies for it.
-- Oscar Wilde



