[FFmpeg-devel] [PATCH 2/3] lavc/cbrt_tablegen: speed up tablegen slightly
Ganesh Ajjanagadde
gajjanagadde at gmail.com
Thu Dec 31 17:39:21 CET 2015
This exploits the identity cbrt(8*k) = 2*cbrt(k): whenever i is a
multiple of 8, i * cbrt(i) equals 16 * cbrt_tab[i>>3].f, so the cbrt
call can be skipped for one table entry in eight. The resulting tables
are identical on GNU/Linux+gcc.
Sample benchmark (Haswell, GNU/Linux+gcc):
new:
6632898 decicycles in cbrt_tableinit, 256 runs, 0 skips
6623909 decicycles in cbrt_tableinit, 512 runs, 0 skips
prev:
7582339 decicycles in cbrt_tableinit, 256 runs, 0 skips
7563556 decicycles in cbrt_tableinit, 512 runs, 0 skips
i.e. very close to the estimated 12.5% speedup (one cbrt call in eight
is skipped).
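
The arithmetic behind that comparison can be reproduced with a short
sketch. The helper names skipped_fraction and reduction are hypothetical,
CBRT_DEMO_MAIN is an assumed macro that only gates the demo main(), and
the decicycle counts are the ones quoted in the benchmark above.

```c
#include <stdio.h>

#define TAB_SIZE (1 << 13)

/* Fraction of table entries that no longer need a cbrt() call. */
static double skipped_fraction(void)
{
    int i, skipped = 0;

    for (i = 0; i < TAB_SIZE; i++)
        if (!(i & 7))
            skipped++;
    return (double)skipped / TAB_SIZE;
}

/* Fractional reduction in run time going from prev to cur. */
static double reduction(double prev, double cur)
{
    return (prev - cur) / prev;
}

#ifdef CBRT_DEMO_MAIN
int main(void)
{
    printf("estimated: %.4f\n", skipped_fraction());
    printf("256 runs:  %.4f\n", reduction(7582339, 6632898));
    printf("512 runs:  %.4f\n", reduction(7563556, 6623909));
    return 0;
}
#endif
```

The estimate assumes the cbrt calls dominate the run time of the
initialization loop, which the measured figures are consistent with.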
Tested with FATE.
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde at gmail.com>
---
libavcodec/cbrt_tablegen.h | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/libavcodec/cbrt_tablegen.h b/libavcodec/cbrt_tablegen.h
index ef4c099..d3614d8 100644
--- a/libavcodec/cbrt_tablegen.h
+++ b/libavcodec/cbrt_tablegen.h
@@ -43,9 +43,13 @@ static union av_intfloat32 cbrt_tab[1 << 13];
 static av_cold void AAC_RENAME(cbrt_tableinit)(void)
 {
     if (!cbrt_tab[(1<<13) - 1].i) {
+        cbrt_tab[0].f = 0;
         int i;
         for (i = 0; i < 1<<13; i++) {
-            cbrt_tab[i].f = i * cbrt(i);
+            if (!(i & 7))
+                cbrt_tab[i].f = 16 * cbrt_tab[i>>3].f;
+            else
+                cbrt_tab[i].f = i * cbrt(i);
         }
 #if USE_FIXED
         for (i = 0; i < 1<<13; i++) {
--
2.6.4