[FFmpeg-devel] [PATCH] lavc/cook: get rid of wasteful pow in init_pow2table
Ganesh Ajjanagadde
gajjanag at mit.edu
Tue Dec 29 22:12:49 CET 2015
On Tue, Dec 29, 2015 at 11:29 AM, Clément Bœsch <u at pkh.me> wrote:
> On Tue, Dec 29, 2015 at 09:28:34AM -0800, Ganesh Ajjanagadde wrote:
>> The table is highly structured, so pow (or exp2 for that matter) can entirely
>> be avoided, yielding a ~ 40x speedup with no loss of accuracy.
>>
>> sample benchmark (Haswell, GNU/Linux):
>> new:
>> 4449 decicycles in init_pow2table(loop 1000), 254 runs, 2 skips
>> 4411 decicycles in init_pow2table(loop 1000), 510 runs, 2 skips
>> 4391 decicycles in init_pow2table(loop 1000), 1022 runs, 2 skips
>>
>> old:
>> 183673 decicycles in init_pow2table(loop 1000), 256 runs, 0 skips
>> 182142 decicycles in init_pow2table(loop 1000), 512 runs, 0 skips
>> 182104 decicycles in init_pow2table(loop 1000), 1024 runs, 0 skips
>>
>> Signed-off-by: Ganesh Ajjanagadde <gajjanagadde at gmail.com>
>> ---
>> libavcodec/cook.c | 11 +++++++++--
>> 1 file changed, 9 insertions(+), 2 deletions(-)
>>
>> diff --git a/libavcodec/cook.c b/libavcodec/cook.c
>> index d8fb736..aa434a2 100644
>> --- a/libavcodec/cook.c
>> +++ b/libavcodec/cook.c
>> @@ -166,10 +166,17 @@ static float rootpow2tab[127];
>> /* table generator */
>> static av_cold void init_pow2table(void)
>> {
>> + /* fast way of computing 2^i and 2^(0.5*i) for -63 <= i < 64 */
>> int i;
>> + static const float exp2_tab[2] = {1, M_SQRT2};
>
>> + float exp2_val = 1.0842021724855044e-19; /* 2^(-63) */
>> + float root_val = 2.3283064365386963e-10; /* 2^(-32) */
>
> I'm pretty sure you can do
> float exp2_val = pow(2, -63);
> float root_val = pow(2, -32);
> and compilers will inline them
Any decent compiler would.
Alternatively, if hexadecimal floating-point literals (a C99 feature,
cf. the %a printf conversion) were available on all platforms, it would
look quite clean. Hexadecimal floating-point literals are also nice
because they are bit-exact representations of the underlying float,
unlike decimal constants, where one needs to reason about how many
digits are required. I believe "%.17g" works for IEEE-754 doubles; note
that, for instance, "%.17lf" does not on very small inputs.
Unfortunately, MSVC lacks hexadecimal floating-point literals.
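
To make the digit-count point concrete, here is a minimal sketch, not
part of the patch. The helper name roundtrips() is hypothetical; it
formats a double with a given printf format, parses the text back with
strtod, and reports whether the value survived. It assumes IEEE-754
doubles and a C99 library whose strtod accepts hexadecimal input.

```c
#include <stdio.h>
#include <stdlib.h>

/* The two start values from the patch, written as C99 hexadecimal
 * floating-point literals: exact by construction, no digit counting. */
static const float exp2_start = 0x1p-63f; /* 2^(-63) */
static const float root_start = 0x1p-32f; /* 2^(-32) */

/* Hypothetical helper: print x using fmt, parse the text back, and
 * return 1 if the round trip reproduced x exactly, 0 otherwise. */
static int roundtrips(const char *fmt, double x)
{
    char buf[512];
    snprintf(buf, sizeof(buf), fmt, x);
    return strtod(buf, NULL) == x;
}
```

With this, roundtrips("%.17g", 0x1p-63) and roundtrips("%a", 0x1p-63)
both hold, while roundtrips("%.17lf", 0x1p-63) fails: 17 digits after
the decimal point print 2^(-63) as 0.00000000000000000.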
I really don't mind either way, and since you prefer pow(2, -63), I
have changed it locally.
>
> [...]
>
> --
> Clément B.