[FFmpeg-devel] [PATCH v2] lavc/aarch64/fdct: add neon-optimized fdct for aarch64
Martin Storsjö
martin at martin.st
Wed Apr 17 11:19:49 EEST 2024
On Wed, 17 Apr 2024, Ramiro Polla wrote:
> The code is imported from libjpeg-turbo-3.0.1. The neon registers used
> have been changed to avoid modifying v8-v15.
> ---
> libavcodec/aarch64/Makefile | 2 +
> libavcodec/aarch64/fdct.h | 26 ++
> libavcodec/aarch64/fdctdsp_init_aarch64.c | 39 +++
> libavcodec/aarch64/fdctdsp_neon.S | 368 ++++++++++++++++++++++
> libavcodec/avcodec.h | 1 +
> libavcodec/fdctdsp.c | 4 +-
> libavcodec/fdctdsp.h | 2 +
> libavcodec/options_table.h | 1 +
> libavcodec/tests/aarch64/dct.c | 2 +
> tests/checkasm/Makefile | 1 +
> tests/checkasm/checkasm.c | 3 +
> tests/checkasm/checkasm.h | 1 +
> tests/checkasm/fdctdsp.c | 68 ++++
> tests/fate/checkasm.mak | 1 +
> 14 files changed, 518 insertions(+), 1 deletion(-)
> create mode 100644 libavcodec/aarch64/fdct.h
> create mode 100644 libavcodec/aarch64/fdctdsp_init_aarch64.c
> create mode 100644 libavcodec/aarch64/fdctdsp_neon.S
> create mode 100644 tests/checkasm/fdctdsp.c
Overall LGTM, thanks!
You may wish to split the checkasm test out into a separate patch that
comes before the one adding the new implementation.
I was surprised by the header libavcodec/aarch64/fdct.h, which seemed
redundant at first glance, but I see that it is needed by the dct test
executable in libavcodec/tests/aarch64/dct.c, so I guess this is
reasonable. (In most other asm implementations, we just declare the
functions at the start of the *_init.c files.)
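As a rough illustration of the convention mentioned above, here is a minimal C sketch. All names in it (DemoFDCTContext, demo_fdct_neon, demo_fdct_init) are hypothetical and not taken from the patch, and the C body only stands in for what would normally be an assembly routine in a .S file.

```c
#include <stdint.h>

/* Hypothetical stand-in for a DSP context that holds function pointers. */
typedef struct DemoFDCTContext {
    void (*fdct)(int16_t *block);
} DemoFDCTContext;

/*
 * Usual convention: the prototype of the assembly routine is declared
 * directly at the top of the init file, since nothing else needs it.
 * If another translation unit (for example a standalone test program)
 * must also call the routine, this prototype moves into a small shared
 * header instead.
 */
void demo_fdct_neon(int16_t *block);

/*
 * Placeholder body so that the sketch links without an assembler.
 * It is NOT a DCT: it only negates the first coefficient so that a
 * call through the function pointer is observable.
 */
void demo_fdct_neon(int16_t *block)
{
    block[0] = (int16_t)-block[0];
}

/* Install the optimized routine only when the CPU feature is present. */
void demo_fdct_init(DemoFDCTContext *c, int have_neon)
{
    if (have_neon)
        c->fdct = demo_fdct_neon;
}
```

With this layout, a test program that wants to call demo_fdct_neon directly has no prototype to include unless one is exported through a header, which is the situation described above.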
// Martin