[Ffmpeg-devel] yet another silly int vs. float benchmark
Måns Rullgård
Sat May 21 20:44:56 CEST 2005
Michael Niedermayer <michaelni at gmx.at> writes:
> hmm, try to set fv[] to 0 instead of 1, maybe it overflows
Indeed. New numbers with GCC:
100 ; needed 3 cycles -> 3 cycles per operation
100 iv[0]+=iv[1];iv[1]+=iv[0]; needed 206 cycles -> 103 cycles per operation
100 iv[0]*=iv[1];iv[1]*=iv[0]; needed 1804 cycles -> 902 cycles per operation
100 fv[0]+=fv[1];fv[1]+=fv[0]; needed 804 cycles -> 402 cycles per operation
100 fv[0]*=fv[1];fv[1]*=fv[0]; needed 804 cycles -> 402 cycles per operation
100 iv[0]+=iv[1];iv[1]+=iv[2];iv[2]+=iv[3];iv[3]+=iv[4];iv[4]+=iv[5]; needed 261 cycles -> 52 cycles per operation
100 iv[0]*=iv[1];iv[1]*=iv[2];iv[2]*=iv[3];iv[3]*=iv[4];iv[4]*=iv[5]; needed 2010 cycles -> 402 cycles per operation
100 fv[0]+=fv[1];fv[1]+=fv[2];fv[2]+=fv[3];fv[3]+=fv[4];fv[4]+=fv[5]; needed 511 cycles -> 102 cycles per operation
100 fv[0]*=fv[1];fv[1]*=fv[2];fv[2]*=fv[3];fv[3]*=fv[4];fv[4]*=fv[5]; needed 511 cycles -> 102 cycles per operation
With CCC:
100 ; needed 3 cycles -> 3 cycles per operation
100 iv[0]+=iv[1];iv[1]+=iv[0]; needed 204 cycles -> 102 cycles per operation
100 iv[0]*=iv[1];iv[1]*=iv[0]; needed 1801 cycles -> 900 cycles per operation
100 fv[0]+=fv[1];fv[1]+=fv[0]; needed 746 cycles -> 373 cycles per operation
100 fv[0]*=fv[1];fv[1]*=fv[0]; needed 685 cycles -> 342 cycles per operation
100 iv[0]+=iv[1];iv[1]+=iv[2];iv[2]+=iv[3];iv[3]+=iv[4];iv[4]+=iv[5]; needed 259 cycles -> 51 cycles per operation
100 iv[0]*=iv[1];iv[1]*=iv[2];iv[2]*=iv[3];iv[3]*=iv[4];iv[4]*=iv[5]; needed 2070 cycles -> 414 cycles per operation
100 fv[0]+=fv[1];fv[1]+=fv[2];fv[2]+=fv[3];fv[3]+=fv[4];fv[4]+=fv[5]; needed 10 cycles -> 2 cycles per operation
100 fv[0]*=fv[1];fv[1]*=fv[2];fv[2]*=fv[3];fv[3]*=fv[4];fv[4]*=fv[5]; needed 10 cycles -> 2 cycles per operation
The last two figures (10 cycles for the five-statement float tests) are
bogus: the compiler optimized the entire loop into a few stores.
Changing it a little, so that the last statement reads element 0 instead
of element 5, I get some more realistic figures:
100 iv[0]+=iv[1];iv[1]+=iv[2];iv[2]+=iv[3];iv[3]+=iv[4];iv[4]+=iv[0]; needed 262 cycles -> 52 cycles per operation
100 iv[0]*=iv[1];iv[1]*=iv[2];iv[2]*=iv[3];iv[3]*=iv[4];iv[4]*=iv[0]; needed 2138 cycles -> 427 cycles per operation
100 fv[0]+=fv[1];fv[1]+=fv[2];fv[2]+=fv[3];fv[3]+=fv[4];fv[4]+=fv[0]; needed 462 cycles -> 92 cycles per operation
100 fv[0]*=fv[1];fv[1]*=fv[2];fv[2]*=fv[3];fv[3]*=fv[4];fv[4]*=fv[0]; needed 446 cycles -> 89 cycles per operation
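
For reference, the two five-statement float variants measured above differ only in the last statement. What follows is a minimal sketch of that difference, not the actual benchmark source: the function names and the `iterations` parameter are hypothetical, and the timing code is left out. Whether a given compiler folds either loop away depends on the compiler and its flags.

```c
#include <assert.h>
#include <stddef.h>

/* Open chain (hypothetical name): v[5] is only read, never written,
 * so nothing computed in the loop feeds back into the start of the chain. */
void add_chain_open(float *v, int iterations)
{
    assert(v != NULL);
    for (int i = 0; i < iterations; i++) {
        v[0] += v[1]; v[1] += v[2]; v[2] += v[3];
        v[3] += v[4]; v[4] += v[5];
    }
}

/* Closed chain (hypothetical name): the last statement reads v[0],
 * which was updated earlier in the same iteration, so every element
 * depends on results from the loop itself. */
void add_chain_closed(float *v, int iterations)
{
    assert(v != NULL);
    for (int i = 0; i < iterations; i++) {
        v[0] += v[1]; v[1] += v[2]; v[2] += v[3];
        v[3] += v[4]; v[4] += v[0];
    }
}
```

Starting from an array of six ones, a single iteration leaves `v[4]` at 2 in the open variant and at 3 in the closed one, because the closed variant picks up the already updated `v[0]`. With the array set to 0, as suggested in the quoted message, both variants leave every element at 0.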
--
Måns Rullgård
mru at inprovide.com