[FFmpeg-devel] [PATCH 0/6] truehd: ARM optimisations
bavison at riscosopen.org
Wed Mar 19 18:26:15 CET 2014
I present here a patch series aimed at improving the performance of Dolby
TrueHD audio decoding on ARM CPUs, with a particular focus on the ARM1176JZF-S
as featured in the Raspberry Pi. To date, only one function had been
optimised, and that was only for x86.
For each optimisation, I am including benchmarks for two streams:
hd_thx_tex_moo_can_lossless (5.1 channels with a 2.0 channel substream) and
hd_dolby_spheres_lossless_v1 (7.1 channels with 5.1 and 2.0 channel
substreams). As I discovered during my testing, the biggest thing you can do
to reduce the CPU usage is to set AVCodecContext's request_channel_layout
member to a value that specifies a simple channel configuration, such that
only the first substream of TrueHD is decoded. However, I recognise that not
everyone will want to do this, so I have benchmarked each stream both in
2.0 downmix mode (indicated by 6:2 or 8:2) and in full channel decode
(indicated by 6:6 or 8:8).
Benchmarks are statistical sampling hit counts (generated using gperftools) so
lower numbers are better. The total audio decode time is considered to be the
number of hits within function read_access_unit() - which implements
AVCodec::decode for MLP - plus any of its callees. For each patch I also
include hits for the relevant subroutine plus its callees.
The overall benchmark results for the series are as follows:
             Before           After
             Mean    StdDev   Mean    StdDev   Confidence   Change
6:2 total    380.4   22.0     329.3   16.0     100.0%       +15.5%
8:2 total    357.0   17.5     323.6   14.3     100.0%       +10.3%
6:6 total    717.2   23.2     539.9   24.2     100.0%       +32.9%
8:8 total    981.9   16.2     702.5   18.5     100.0%       +39.8%
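
As a rough cross-check of the Change column, the sketch below assumes it is the
relative speed-up computed from the two mean hit counts, i.e.
(before / after - 1) * 100. That formula is an assumption inferred from the
figures, not something stated in this post, and the names bench_row,
change_percent and print_changes are purely illustrative. With the rounded
means from the table it gives +15.5%, +10.3%, +32.8% and +39.8%; the small
difference on the 6:6 row is presumably because the posted figure was derived
from unrounded means.

```c
#include <stdio.h>

/* One row of the results table; the field names are illustrative only. */
struct bench_row {
    const char *label;
    double before_mean;   /* mean hit count before the series */
    double after_mean;    /* mean hit count after the series */
};

/*
 * Assumed definition of the "Change" column: relative speed-up in percent.
 * Hit counts are proportional to time spent, so fewer hits means faster.
 */
double change_percent(double before_mean, double after_mean)
{
    return (before_mean / after_mean - 1.0) * 100.0;
}

/* Print the assumed Change value for each row, signed, to one decimal place. */
void print_changes(const struct bench_row *rows, int count)
{
    for (int i = 0; i < count; i++)
        printf("%-10s %+.1f%%\n", rows[i].label,
               change_percent(rows[i].before_mean, rows[i].after_mean));
}
```

Calling print_changes() on an array holding the four rows above (for example
{ "6:2 total", 380.4, 329.3 }) prints one signed percentage per row.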
Ben Avison (6):
  truehd: add hand-scheduled ARM asm version of mlp_filter_channel.
  truehd: break out part of rematrix_channels into platform-specific
  truehd: add hand-scheduled ARM asm version of
  truehd: tune VLC decoding for ARM.
  truehd: break out part of output_data into platform-specific
  truehd: add hand-scheduled ARM asm version of ff_mlp_pack_output.
 libavcodec/arm/Makefile          |    4 +
 libavcodec/arm/mlpdsp_arm.S      | 1164 ++++++++++++++++++++++++++++++++++++++
 libavcodec/arm/mlpdsp_init_arm.c |  112 ++++
 libavcodec/mlpdec.c              |   90 ++--
 libavcodec/mlpdsp.c              |   71 +++
 libavcodec/mlpdsp.h              |   46 ++
 6 files changed, 1442 insertions(+), 45 deletions(-)
 create mode 100644 libavcodec/arm/mlpdsp_arm.S
 create mode 100644 libavcodec/arm/mlpdsp_init_arm.c