[FFmpeg-devel] [PATCH 0/6] truehd: ARM optimisations

Ben Avison bavison at riscosopen.org
Wed Mar 19 18:26:15 CET 2014

I present here a patch series aimed at improving the performance of Dolby
TrueHD audio decoding on ARM CPUs, with a particular focus on the ARM1176JZF-S
as featured in the Raspberry Pi. To date, only one function has been
optimised, and only for x86.

For each optimisation, I am including benchmarks for two streams:
hd_thx_tex_moo_can_lossless (5.1 channels with a 2.0 channel substream) and
hd_dolby_spheres_lossless_v1 (7.1 channels with 5.1 and 2.0 channel
substreams). As I discovered during my testing, the biggest thing you can do
to reduce the CPU usage is to set AVCodecContext's request_channel_layout
member to a value that specifies a simple channel configuration, such that
only the first substream of TrueHD is decoded. However, I recognise that not
everyone will want to do this, so I have benchmarked each stream both in 2.0
downmix mode (indicated by 6:2 or 8:2) and as a full channel decode
(indicated by 6:6 or 8:8).

Benchmarks are statistical sampling hit counts (generated using gperftools) so
lower numbers are better. The total audio decode time is considered to be the
number of hits within function read_access_unit() - which implements
AVCodec::decode for MLP - plus any of its callees. For each patch I also
include hits for the relevant subroutine plus its callees.

The overall benchmark results for the series are as follows:

           Before          After
           Mean   StdDev   Mean   StdDev  Confidence  Change
6:2 total  380.4  22.0     329.3  16.0    100.0%      +15.5%
8:2 total  357.0  17.5     323.6  14.3    100.0%      +10.3%
6:6 total  717.2  23.2     539.9  24.2    100.0%      +32.9%
8:8 total  981.9  16.2     702.5  18.5    100.0%      +39.8%
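
Since lower hit counts are better yet the Change column is positive, it may
help to spell out how that column can be reproduced. The figures appear
consistent with a speedup ratio, that is, the Before mean divided by the After
mean, minus one. The sketch below is only a cross-check under that assumption;
speedup_percent() is a hypothetical helper and is not part of the patch series.

```c
#include <assert.h>

/* Hypothetical helper (not part of the patch series): relative speedup in
 * percent, given mean profiler hit counts before and after a change.
 * Fewer hits is better, so a positive result means the code got faster. */
double speedup_percent(double before_hits, double after_hits)
{
    assert(after_hits > 0.0);   /* guard against division by zero */
    return (before_hits / after_hits - 1.0) * 100.0;
}
```

For instance, speedup_percent(380.4, 329.3) evaluates to roughly 15.5, in line
with the 6:2 row. The 6:6 row comes out nearer 32.8 than the posted 32.9,
presumably because the means shown in the table are themselves rounded.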

Ben Avison (6):
  truehd: add hand-scheduled ARM asm version of mlp_filter_channel.
  truehd: break out part of rematrix_channels into platform-specific
  truehd: add hand-scheduled ARM asm version of
  truehd: tune VLC decoding for ARM.
  truehd: break out part of output_data into platform-specific
  truehd: add hand-scheduled ARM asm version of ff_mlp_pack_output.

 libavcodec/arm/Makefile          |    4 +
 libavcodec/arm/mlpdsp_arm.S      | 1164 ++++++++++++++++++++++++++++++++++++++
 libavcodec/arm/mlpdsp_init_arm.c |  112 ++++
 libavcodec/mlpdec.c              |   90 ++--
 libavcodec/mlpdsp.c              |   71 +++
 libavcodec/mlpdsp.h              |   46 ++
 6 files changed, 1442 insertions(+), 45 deletions(-)
 create mode 100644 libavcodec/arm/mlpdsp_arm.S
 create mode 100644 libavcodec/arm/mlpdsp_init_arm.c

