[Ffmpeg-devel] int vs. float profiler, take 2

Michael Niedermayer michaelni
Sat May 21 17:11:22 CEST 2005


Hi

On Saturday 21 May 2005 15:59, Attila Kinali wrote:
> On Fri, 20 May 2005 17:05:30 -0600
>
> Mike Melanson <mike at multimedia.cx> wrote:
> > The ASM code has ITERATIONS set to 1 right now. I would be interested to
> > know the results from varying CPUs using 1, 10, and 100 iterations.
>
> Here again my PentiumM:
>
>   warming up with 1000 cycles...
> integer_adder(), 1 adds, 43 cycles used (overhead = 43)
> float_adder(), 1 adds, 199 cycles used (overhead = 43)
> integer_mult(), 1 mults, 43 cycles used (overhead = 43)
> float_mult(), 1 mults, 200 cycles used (overhead = 43)
>
>   warming up with 1000 cycles...
> integer_adder(), 10 adds, 46 cycles used (overhead = 43)
> float_adder(), 10 adds, 1596 cycles used (overhead = 43)
> integer_mult(), 10 mults, 87 cycles used (overhead = 43)
> float_mult(), 10 mults, 1618 cycles used (overhead = 43)
>
>  warming up with 1000 cycles...
> integer_adder(), 100 adds, 136 cycles used (overhead = 43)
> float_adder(), 100 adds, 15579 cycles used (overhead = 43)
> integer_mult(), 100 mults, 539 cycles used (overhead = 43)
> float_mult(), 100 mults, 15780 cycles used (overhead = 43)
>
> And here the result with Gabriel Gerhardsson's math.asm:
>
>   warming up with 1000 cycles...
> integer_adder(), 1 adds, 291 cycles used (overhead = 291)
> float_adder(), 1 adds, 447 cycles used (overhead = 290)
> integer_mult(), 1 mults, 295 cycles used (overhead = 291)
> float_mult(), 1 mults, 448 cycles used (overhead = 291)
>
>   warming up with 1000 cycles...
> integer_adder(), 10 adds, 301 cycles used (overhead = 291)
> float_adder(), 10 adds, 1844 cycles used (overhead = 291)
> integer_mult(), 10 mults, 340 cycles used (overhead = 290)
> float_mult(), 10 mults, 1865 cycles used (overhead = 290)
>
>   warming up with 1000 cycles...
> integer_adder(), 100 adds, 397 cycles used (overhead = 291)
> float_adder(), 100 adds, 15827 cycles used (overhead = 290)
> integer_mult(), 100 mults, 792 cycles used (overhead = 290)
> float_mult(), 100 mults, 16027 cycles used (overhead = 291)
>
> Any explanation why float performs so much worse than integer
> compared to the first version of this benchmark ?

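Yes, the code is buggy (not that the results would be too meaningful if it weren't).
Ahh, the bug: it is an FPU stack overflow. The x87 register stack has only 8
slots, so values that are loaded and never popped overflow it almost immediately.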
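
To illustrate what an FPU stack overflow means here, below is a toy model of the 8-slot x87 register stack, where a load pushes and fstp pops. This is only a sketch: the instruction sequences are hypothetical and are not taken from the actual math.asm, and the function name is made up for the example.

```python
# Toy model of the x87 register stack: 8 slots, loads push, fstp pops.
# The instruction sequences below are hypothetical, not taken from math.asm.

X87_STACK_DEPTH = 8  # ST(0) through ST(7)


def count_overflows(instructions):
    """Return how many loads hit an already full register stack."""
    depth = 0
    overflows = 0
    for insn in instructions:
        if insn == "fld":
            if depth == X87_STACK_DEPTH:
                overflows += 1  # stack fault: no free slot for the load
            else:
                depth += 1
        elif insn == "fstp":
            if depth > 0:
                depth -= 1  # store and pop frees one slot
    return overflows


leaky = ["fld"] * 100              # 100 loads, nothing ever popped
balanced = ["fld", "fstp"] * 100   # every load is matched by a pop

print(count_overflows(leaky))      # 92
print(count_overflows(balanced))   # 0
```

In this model the first 8 loads fill the stack and every later load faults, which is why popping values with fstp makes the problem go away.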

original:
  warming up with 1000 cycles...
integer_adder(), 100 adds, 130 cycles used (overhead = 32)
float_adder(), 100 adds, 11558 cycles used (overhead = 32)
integer_mult(), 100 mults, 556 cycles used (overhead = 32)
float_mult(), 100 mults, 11745 cycles used (overhead = 32)

with a few fstp added:
  warming up with 1000 cycles...
integer_adder(), 100 adds, 130 cycles used (overhead = 32)
float_adder(), 100 adds, 328 cycles used (overhead = 32)
integer_mult(), 100 mults, 529 cycles used (overhead = 32)
float_mult(), 100 mults, 527 cycles used (overhead = 32)
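
For easier comparison, the two runs above can be reduced to net cycles per operation, that is (cycles used - overhead) / operations. A small sketch, using only the numbers printed above; the helper name and the tuple layout are mine:

```python
# (label, operations, cycles used, overhead), copied from the two runs above
original = [
    ("integer_adder", 100, 130, 32),
    ("float_adder", 100, 11558, 32),
    ("integer_mult", 100, 556, 32),
    ("float_mult", 100, 11745, 32),
]
with_fstp = [
    ("integer_adder", 100, 130, 32),
    ("float_adder", 100, 328, 32),
    ("integer_mult", 100, 529, 32),
    ("float_mult", 100, 527, 32),
]


def net_cycles_per_op(ops, cycles, overhead):
    """Cycles per operation after subtracting the measurement overhead."""
    return (cycles - overhead) / ops


for name, run in (("original", original), ("with fstp", with_fstp)):
    print(name)
    for label, ops, cycles, overhead in run:
        per_op = net_cycles_per_op(ops, cycles, overhead)
        print(f"  {label:14s} {per_op:8.2f} cycles/op")
```

On these numbers float_adder drops from about 115 to about 3 net cycles per add, and float_mult ends up at roughly the same cost as integer_mult.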

[...]
-- 
Michael




