[Ffmpeg-devel] [PATCH] SSE counterpart of ff_imdct_calc_3dn2

Michael Niedermayer michaelni
Thu Aug 24 12:23:42 CEST 2006


On Thu, Aug 24, 2006 at 06:39:52AM +0200, Luca Barbato wrote:
> Rich Felker wrote:
> > 
> > And I still insist that this statement is fundamentally false. Better
> > than what? Whatever code gcc generates with the intrinsics, you can
> > always generate the same or better code if you just write it yourself.
> I don't know how slow each possible op is for each cpu, gcc should for
> most of the documented ones....

gcc does not know it (exactly) either, as
1. the docs have a tendency to be somewhat "over-optimistic" in their values.
   IIRC there are some cases where the claimed throughput of some instructions
   cannot be achieved, as some stages of the pipeline simply can't handle it,
   so the gcc devels IMO should benchmark the instructions themselves and not
   blindly believe what the docs say
2. "modern" cpus are complex and considering every part is not possible. It's
   not possible because you don't know how long a read or write will take,
   don't know whether a branch will be predicted or not, and don't know at
   what address various things will be. Yes, that matters: put 3 variables
   exactly 4096 bytes apart and access them in a loop and you will have 100%
   cache misses on several cpus, and if not, try 5 such variables or try a
   larger power of 2 spacing. These are all things the author will likely
   know more about than gcc, as she wrote the code and understands it; gcc
   doesn't
3. different "revisions" of cpus need different amounts of time to execute
   stuff, the various P4s for example; gcc does not know about that though
4. the docs only mention some information, there's a lot missing; for example
   no doc I've seen from amd/intel explained how the cpus reorder instructions
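
The address effect described in point 2 can be checked with a small timing
loop. What follows is only a minimal sketch and not code from this thread: the
names time_strided_access and alloc_strided are made up for illustration, and
whether any slowdown shows up at all depends on the cache size and
associativity of the cpu it runs on.

```c
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: allocate a zeroed buffer large enough for `count`
   accesses spaced `stride` bytes apart. Returns NULL on failure. */
unsigned char *alloc_strided(size_t count, size_t stride)
{
    return calloc(count * stride, 1);
}

/* Hypothetical helper: increment `count` bytes spaced `stride` bytes apart,
   repeating the whole pass `rounds` times, and return the elapsed CPU time
   in seconds. The volatile qualifier keeps the compiler from removing or
   merging the memory accesses. */
double time_strided_access(volatile unsigned char *buf, size_t count,
                           size_t stride, long rounds)
{
    clock_t start = clock();

    for (long r = 0; r < rounds; r++)
        for (size_t i = 0; i < count; i++)
            buf[i * stride]++;

    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

One way to use it would be to time a few million rounds with count 3 (or 5, or
more) and stride 4096, then repeat with a stride that is not a power of 2,
such as 4096+64, and compare the two times. No particular result is promised;
the point is that the answer has to be measured on the cpu in question.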

furthermore if you don't know approximately how fast each op is then your
code will suck and it won't make a difference whether gcc reorders things or
not

Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

In the past you could go to a library and read, borrow or copy any book
Today you'd get arrested for merely telling someone where the library is
