[FFmpeg-devel] [PATCH 2/3] lavc/cbrt_tablegen: speed up tablegen slightly

Ganesh Ajjanagadde gajjanagadde at gmail.com
Thu Dec 31 17:39:21 CET 2015


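This exploits a very simple property of the cbrt function, namely
cbrt(8 * x) = 2 * cbrt(x). For i = 8 * k this gives
i * cbrt(i) = 16 * (k * cbrt(k)), so every entry at a multiple of 8 can
be derived from an earlier entry without calling cbrt. The result is a
non-negligible speed-up. Tables turn out to be identical on
GNU/Linux+gcc.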

Sample benchmark (Haswell, GNU/Linux+gcc):
new:
6632898 decicycles in cbrt_tableinit,     256 runs,      0 skips
6623909 decicycles in cbrt_tableinit,     512 runs,      0 skips

prev:
7582339 decicycles in cbrt_tableinit,     256 runs,      0 skips
7563556 decicycles in cbrt_tableinit,     512 runs,      0 skips

i.e. roughly 12.4-12.5% fewer decicycles, very close to the estimated
12.5% speedup (one in every eight entries no longer calls cbrt).

Tested with FATE.

Signed-off-by: Ganesh Ajjanagadde <gajjanagadde at gmail.com>
---
 libavcodec/cbrt_tablegen.h | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/libavcodec/cbrt_tablegen.h b/libavcodec/cbrt_tablegen.h
index ef4c099..d3614d8 100644
--- a/libavcodec/cbrt_tablegen.h
+++ b/libavcodec/cbrt_tablegen.h
@@ -43,9 +43,13 @@ static union av_intfloat32 cbrt_tab[1 << 13];
 static av_cold void AAC_RENAME(cbrt_tableinit)(void)
 {
     if (!cbrt_tab[(1<<13) - 1].i) {
         int i;
+        cbrt_tab[0].f = 0;
         for (i = 0; i < 1<<13; i++) {
-            cbrt_tab[i].f = i * cbrt(i);
+            if (!(i & 7))
+                cbrt_tab[i].f = 16 * cbrt_tab[i>>3].f;
+            else
+                cbrt_tab[i].f = i * cbrt(i);
         }
 #if USE_FIXED
         for (i = 0; i < 1<<13; i++) {
-- 
2.6.4


