[FFmpeg-cvslog] swresample/resample: speed up build_filter for Blackman-Nuttall filter

Ganesh Ajjanagadde git at videolan.org
Fri Nov 6 03:53:33 CET 2015


ffmpeg | branch: master | Ganesh Ajjanagadde <gajjanagadde at gmail.com> | Wed Nov  4 22:02:13 2015 -0500| [c8780822bacc38a8d84c882d564b07dd152366ed] | committer: Ganesh Ajjanagadde

swresample/resample: speed up build_filter for Blackman-Nuttall filter

This uses the trigonometric double and triple angle formulae to avoid
repeated (expensive) evaluation of libc's cos().
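
For reference, the identities involved can be written out as follows. This is a sketch: the labels a_0..a_3 are introduced here only as shorthand for the four window coefficients in the patch and do not appear in the source.

```latex
% Shorthand (assumed labels): t = \cos w,
% a_0 = 0.3635819, a_1 = 0.4891775, a_2 = 0.1365995, a_3 = 0.0106411
\begin{align*}
\cos 2w &= 2\cos^2 w - 1 = 2t^2 - 1 \\
\cos 3w &= 4\cos^3 w - 3\cos w = 4t^3 - 3t \\
a_0 - a_1\cos w + a_2\cos 2w - a_3\cos 3w
        &= a_0 - a_1 t + a_2\,(2t^2 - 1) - a_3\,(4t^3 - 3t)
\end{align*}
```

The right-hand side needs a single cos() evaluation per tap instead of three, at the cost of a few multiplications.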

Sample benchmark (x86-64, Haswell, GNU/Linux)
test: fate-swr-resample-dblp-44100-2626
old:
1104466600 decicycles in build_filter(loop 1000),     256 runs,      0 skips
1096765286 decicycles in build_filter(loop 1000),     512 runs,      0 skips
1070479590 decicycles in build_filter(loop 1000),    1024 runs,      0 skips

new:
588861423 decicycles in build_filter(loop 1000),     256 runs,      0 skips
591262754 decicycles in build_filter(loop 1000),     512 runs,      0 skips
577355145 decicycles in build_filter(loop 1000),    1024 runs,      0 skips

This results in small numerical differences from the old expression.
Worst case on [0, 2*M_PI] (argmax 0.008):
max diff (relative): 0.000000000000157289807188
blackman_old(0.008): 0.000363951585488813192382
blackman_new(0.008): 0.000363951585488755946507

These differences are judged to be insignificant relative to the
performance gain. For instance, PSNR against the reference file is
unchanged up to the second decimal place.

Reviewed-by: Michael Niedermayer <michael at niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde at gmail.com>

> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=c8780822bacc38a8d84c882d564b07dd152366ed
---

 libswresample/resample.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/libswresample/resample.c b/libswresample/resample.c
index c881ed8..6f2ca98 100644
--- a/libswresample/resample.c
+++ b/libswresample/resample.c
@@ -73,7 +73,7 @@ static double bessel(double x){
 static int build_filter(ResampleContext *c, void *filter, double factor, int tap_count, int alloc, int phase_count, int scale,
                         int filter_type, int kaiser_beta){
     int ph, i;
-    double x, y, w;
+    double x, y, w, t;
     double *tab = av_malloc_array(tap_count+1,  sizeof(*tab));
     const int center= (tap_count-1)/2;
 
@@ -100,7 +100,8 @@ static int build_filter(ResampleContext *c, void *filter, double factor, int tap
                 break;}
             case SWR_FILTER_TYPE_BLACKMAN_NUTTALL:
                 w = 2.0*x / (factor*tap_count) + M_PI;
-                y *= 0.3635819 - 0.4891775 * cos(w) + 0.1365995 * cos(2*w) - 0.0106411 * cos(3*w);
+                t = cos(w);
+                y *= 0.3635819 - 0.4891775 * t + 0.1365995 * (2*t*t-1) - 0.0106411 * (4*t*t*t - 3*t);
                 break;
             case SWR_FILTER_TYPE_KAISER:
                 w = 2.0*x / (factor*tap_count*M_PI);


