[Ffmpeg-devel] Fixed vs. Floating Point AAC

Michael Niedermayer michaelni
Fri Mar 10 03:07:45 CET 2006


Hi

On Thu, Mar 09, 2006 at 08:33:54PM -0500, Rich Felker wrote:
> On Thu, Mar 09, 2006 at 03:36:35PM -0800, Michel Lespinasse wrote:
> > Hope this helps, though I'd be surprised if integer supporters changed
> > their mind about this. In my opinion, they've been ignoring the evidence
> > for 10 years already.
> 
> Enough of the false accusations. In 1996, the common desktop cpu was
> at best an original-pentium. 486's were even common in many settings
> at the time. On both of these, float is extremely slow compared to
> integer. Since I'm sure you won't care about the 486's I'll focus on
> the pentium; everyone knows how abysmal float performance was on 486
> anyway. On pentium, basic integer arithmetic had throughput of 1/2
> cycle (it could execute in pipeline beside another instruction) and
> latency of 1 cycle. Multiply was ~10 cycles and divide was ~30 iirc.
> Floating point had at best throughput of one whole cycle per
> operation, and latency of several cycles. Multiply and divide were
> similar to integer in latency, except that the cpu was free to perform
> other non-float tasks at the same time. In principle this could have
> given large performance gains, but it's rarely possible to interleave
> code that well without hand-written asm and without interleaving two
> unrelated pieces of code (i.e. most intensive float dsp functions
> don't have integer stuff to do at the same time). Moreover, since

IMUL on P1 needs 9 cycles (more if prefixes are needed) and it's not
pairable with anything.
FMUL on P1 needs 3 cycles, can pair with FXCH, and a new floating point
instruction can be started the next cycle (2 cycles of overlap).
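
To put rough numbers on that, here is a minimal C sketch of the cycle
arithmetic, taking the figures above at face value. The function names and
the model itself are illustrative assumptions: it only covers independent
operations and ignores dependencies, FXCH scheduling and memory access.

```c
#include <assert.h>

/* Figures as quoted in this mail; taken as given, not re-measured. */
enum { IMUL_CYCLES = 9, FP_LATENCY = 3, FP_ISSUE_INTERVAL = 1 };

/* n integer multiplies back to back, with nothing paired or overlapped. */
long imul_cycles(long n)
{
    assert(n >= 0);
    return n * IMUL_CYCLES;
}

/* n independent floating point operations, a new one issued every cycle:
 * the first result arrives after FP_LATENCY cycles, then one per cycle. */
long fp_pipelined_cycles(long n)
{
    assert(n >= 0);
    if (n == 0)
        return 0;
    return FP_LATENCY + (n - 1) * FP_ISSUE_INTERVAL;
}
```

Under these assumptions, imul_cycles(8) is 72 while fp_pipelined_cycles(8)
is 10. Real code has dependencies between operations, so the gap will be
smaller in practice.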



> final output would always be integers, you have the additional penalty
> of conversion to integer. This can be done with bias-add hacks to
> obtain decent performance (but still not great); otherwise it will
> take at least 20-30 cycles just to store the result, iirc.
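
For readers unfamiliar with the "bias-add hacks" mentioned in the quoted
paragraph above, here is a minimal C sketch of one common form of the trick.
The function name double_to_int_bias is made up for illustration. The sketch
assumes IEEE 754 doubles, the default round-to-nearest mode and inputs that
fit in 32 bits; it is not taken from any decoder discussed in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Round a double to the nearest 32-bit integer without a float-to-int
 * conversion instruction. Adding 2^52 + 2^51 pushes the value into a range
 * where one unit in the last place equals 1, so the rounded integer ends up
 * in the low 32 bits of the mantissa (in two's complement form). */
int32_t double_to_int_bias(double x)
{
    assert(x >= -2147483648.0 && x <= 2147483647.0);

    /* volatile forces the sum to be rounded to double precision even if
     * the compiler evaluates expressions in a wider format. */
    volatile double biased = x + 6755399441055744.0; /* 2^52 + 2^51 */
    double tmp = biased;

    uint64_t bits;
    memcpy(&bits, &tmp, sizeof bits);   /* reinterpret without aliasing UB */

    uint32_t low = (uint32_t)bits;      /* low 32 mantissa bits */
    int32_t result;
    memcpy(&result, &low, sizeof result);
    return result;
}
```

Ties round to even here, so 2.5 becomes 2 rather than 3, which differs from
a plain C cast (which truncates toward zero).
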
> 
> The evidence is clear that even on present-day cpus, integer
> arithmetic is often faster than floating point. 

depends on the arithmetic used; for the things done in audio codecs this
isn't true, and that's what this thread is about


> Floating point fanboys
> will cite cycle counts or isolated benchmarks, ignoring the overhead
> of converting to/from float and ignoring the gains from using
> vectorized integer operations with small operands. 

you can't use small operands for audio due to the needed precision,
not to mention the lack of a 32-bit MMX multiply
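
A small C sketch of the precision point, assuming plain Q15 and Q31
fixed-point formats (the helper names mul_q15 and mul_q31 are hypothetical,
not from any existing codec). MMX provides 16-bit multiplies such as PMULHW
and PMADDWD, which correspond to the first helper; the second needs a 32x32
multiply with a 64-bit intermediate.

```c
#include <assert.h>
#include <stdint.h>

/* Q15 multiply: 16-bit operands, 32-bit product, keep the top bits.
 * Right shift of a negative value is assumed to be arithmetic, as it is
 * on mainstream compilers. */
int16_t mul_q15(int16_t a, int16_t b)
{
    /* -1.0 * -1.0 does not fit in Q15, so exclude it. */
    assert(!(a == INT16_MIN && b == INT16_MIN));
    return (int16_t)(((int32_t)a * b) >> 15);
}

/* Q31 multiply: 32-bit operands, 64-bit product, keep the top bits. */
int32_t mul_q31(int32_t a, int32_t b)
{
    assert(!(a == INT32_MIN && b == INT32_MIN));
    return (int32_t)(((int64_t)a * b) >> 31);
}
```

Scaling a quiet sample of 3 by a coefficient of 0.5 gives mul_q15(3, 16384)
== 1, so the 1.5 is truncated and a third of the value is lost. The same
operation at 32 bits, mul_q31(3 << 16, 16384 << 16), gives 98304, which is
exactly 1.5 in the corresponding scale.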


> I don't want to
> compare "implementation X with arithmetic switched between int and
> float".

no, you want to sit in a cave, bang your head against the wall, stick
your fingers in your ears and scream that integers are always faster ;)


> What I want to see compared is: "what is the fastest possible
> decoder you can make with integer arithmetic, versus the fastest
> possible with float arithmetic?" Precision is a non-issue as long as
> the difference cannot be detected by doubleblind testing. The example
> I cite is libvorbis vs tremor. On my K6 tremor is several times
> faster, and on Athlons it's reportedly something like 25-50% faster.
> Clearly the mp3lib example runs the opposite direction; however I've
> never seen an integer mp3 implementation that even claims to aim for
> performance.

if I get bored I'll do double blind testing of the accurate vs. low
precision tremor

[...]

-- 
Michael




