[FFmpeg-devel] [aarch64] improve hscale by 50% with multi-threading
Michael Niedermayer
michael at niedermayer.cc
Sat Jul 18 09:35:25 EEST 2020
On Fri, Jul 17, 2020 at 11:08:02PM -0500, Sebastian Pop wrote:
> hscale is bound by the number of multiply-adds available on a given core.
> The attached patch doubles the number of multiply-adds by distributing half
> the load to a helper thread.
>
> The performance improves up to 50% on Graviton2 Arm Neoverse-N1 processors.
>
> $ ./ffmpeg_g -nostats -f lavfi -i testsrc2=4k:d=2 -vf bench=start,scale=1024x1024,bench=stop -f null -
> before: [bench @ 0xaaaad62c3d30] t:0.013293 avg:0.013315 max:0.013697 min:0.013293
> after: [bench @ 0xaaaae9346d30] t:0.009637 avg:0.009691 max:0.010005 min:0.009637
> 38% improvement
>
> scale=1280x720 49% improvement
> before: [bench @ 0xaaaadba88d30] t:0.015973 avg:0.016321 max:0.016917 min:0.015973
> after: [bench @ 0xaaaabc78dd30] t:0.010823 avg:0.010869 max:0.011552 min:0.010708
>
> scale=852x480 45% improvement
> before: [bench @ 0xaaaaeeed0d30] t:0.013731 avg:0.013727 max:0.013773 min:0.013279
> after: [bench @ 0xaaaaf5f5dd30] t:0.009279 avg:0.009296 max:0.009328 min:0.009187
>
> scale=640x360 45% improvement
> before: [bench @ 0xaaaacee25d30] t:0.012010 avg:0.012006 max:0.012053 min:0.011653
> after: [bench @ 0xaaaaea2b5d30] t:0.008077 avg:0.008084 max:0.008409 min:0.008057
>
> scale=284x160 36% improvement
> before: [bench @ 0xaaaadbb9ed30] t:0.008384 avg:0.008367 max:0.008421 min:0.008193
> after: [bench @ 0xaaaafb1d6d30] t:0.006099 avg:0.006100 max:0.006120 min:0.006026
> aarch64/swscale.c | 44 +++++++++++++++++++++++++++++++++++++++++++-
> swscale_internal.h | 15 +++++++++++++++
> utils.c | 14 ++++++++++++++
> 3 files changed, 72 insertions(+), 1 deletion(-)
> 9a65bd72cd0a37e73a554e568b34f9d6bb27cb58 0001-aarch64-improve-hscale-by-50-with-multi-threading.patch
> From 3321950c109b416e63eda59c76e6365abc2072b8 Mon Sep 17 00:00:00 2001
> From: Sebastian Pop <spop at amazon.com>
> Date: Thu, 2 Jul 2020 16:57:58 +0000
> Subject: [PATCH] [aarch64] improve hscale by 50% with multi-threading
Multithreading support should be added in an architecture-independent way.
Thanks
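
For illustration only, a minimal sketch of what an architecture-independent
split could look like: plain C with POSIX threads and no aarch64-specific
code. It is not taken from the attached patch or from libswscale. The names
HScaleJob, hscale_range and hscale_two_threads are hypothetical, the inner
loop is only loosely modeled on a generic horizontal filter (multiply-add
over the filter taps, shift, clamp), and a thread is created per call purely
for brevity; a real implementation would presumably reuse a worker thread or
an existing thread pool.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical job description: one contiguous range of output pixels. */
typedef struct HScaleJob {
    int16_t       *dst;
    const uint8_t *src;
    const int16_t *filter;      /* filter_size coefficients per output pixel */
    const int32_t *filter_pos;  /* first source index for each output pixel */
    int            filter_size;
    int            begin, end;  /* half-open output range [begin, end) */
} HScaleJob;

/* Portable C loop: multiply-add over the taps, shift by 7, clamp to int16. */
static void hscale_range(const HScaleJob *job)
{
    for (int i = job->begin; i < job->end; i++) {
        int val = 0;
        int pos = job->filter_pos[i];
        for (int j = 0; j < job->filter_size; j++)
            val += job->src[pos + j] * job->filter[job->filter_size * i + j];
        val >>= 7;
        if (val > INT16_MAX) val = INT16_MAX;
        if (val < INT16_MIN) val = INT16_MIN;
        job->dst[i] = (int16_t)val;
    }
}

/* Thread entry point: runs one job and returns. */
static void *hscale_worker(void *arg)
{
    hscale_range(arg);
    return NULL;
}

/* Splits the output row in two halves: the helper thread takes the upper
 * half while the calling thread computes the lower half. Returns 1 if the
 * helper thread was used, 0 if the whole row was computed serially. */
int hscale_two_threads(int16_t *dst, int dst_w, const uint8_t *src,
                       const int16_t *filter, const int32_t *filter_pos,
                       int filter_size)
{
    assert(dst_w >= 0 && filter_size > 0);

    HScaleJob lo = { dst, src, filter, filter_pos, filter_size, 0, dst_w / 2 };
    HScaleJob hi = lo;
    hi.begin = dst_w / 2;
    hi.end   = dst_w;

    pthread_t helper;
    int have_helper = pthread_create(&helper, NULL, hscale_worker, &hi) == 0;

    hscale_range(&lo);

    if (have_helper)
        pthread_join(helper, NULL);   /* wait until the upper half is done */
    else
        hscale_range(&hi);            /* thread creation failed: run serially */

    return have_helper;
}
```

The two halves write disjoint parts of dst and only read the shared inputs,
so no locking is needed beyond the join. Nothing here depends on the CPU
architecture, so an optimized per-architecture kernel could in principle be
substituted for hscale_range without touching the threading logic.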
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
During times of universal deceit, telling the truth becomes a
revolutionary act. -- George Orwell