[FFmpeg-devel] [PATCH v3] aacenc: add SIMD optimizations for abs_pow34 and quantization

Michael Niedermayer michael at niedermayer.cc
Tue Oct 18 16:51:22 EEST 2016


On Tue, Oct 18, 2016 at 09:02:19AM +0100, Rostislav Pehlivanov wrote:
> On 17 October 2016 at 23:43, Michael Niedermayer <michael at niedermayer.cc> wrote:
> 
> > On Mon, Oct 17, 2016 at 10:24:48PM +0100, Rostislav Pehlivanov wrote:
> > > Should fix segfaults on x86-32
> > >
> > > Performance improvements:
> > >
> > > quant_bands:
> > > with:     681 decicycles in quant_bands, 8388453 runs,    155 skips
> > > without: 1190 decicycles in quant_bands, 8388386 runs,    222 skips
> > > Around 42% for the function
> > >
> > > Twoloop coder:
> > >
> > > abs_pow34:
> > > with/without: 7.82s/8.17s
> > > Around 4% for the entire encoder
> > >
> > > Both:
> > > with/without: 7.15s/8.17s
> > > Around 12% for the entire encoder
> > >
> > > Fast coder:
> > >
> > > abs_pow34:
> > > with/without: 3.40s/3.77s
> > > Around 10% for the entire encoder
> > >
> > > Both:
> > > with/without: 3.02s/3.77s
> > > Around 20% faster for the entire encoder
> > >
> > > Signed-off-by: Rostislav Pehlivanov <atomnuker at gmail.com>
> > > ---
> > >  libavcodec/aaccoder.c            | 27 +++++++------
> > >  libavcodec/aaccoder_trellis.h    |  2 +-
> > >  libavcodec/aaccoder_twoloop.h    |  2 +-
> > >  libavcodec/aacenc.c              |  4 ++
> > >  libavcodec/aacenc.h              |  6 +++
> > >  libavcodec/aacenc_is.c           |  6 +--
> > >  libavcodec/aacenc_ltp.c          |  4 +-
> > >  libavcodec/aacenc_pred.c         |  6 +--
> > >  libavcodec/aacenc_quantization.h |  4 +-
> > >  libavcodec/aacenc_utils.h        |  4 +-
> > >  libavcodec/x86/Makefile          |  2 +
> > >  libavcodec/x86/aacencdsp.asm     | 87 ++++++++++++++++++++++++++++++++++++++++
> > >  libavcodec/x86/aacencdsp_init.c  | 43 ++++++++++++++++++++
> > >  13 files changed, 170 insertions(+), 27 deletions(-)
> > >  create mode 100644 libavcodec/x86/aacencdsp.asm
> > >  create mode 100644 libavcodec/x86/aacencdsp_init.c
> >
> > fate passes on linux32/64 x86, mingw32/64 x86
> >
> > build fails on arm:
> >
> > libavcodec/libavcodec.a(aacenc.o): In function `aac_encode_init':
> > ffmpeg/arm/src/libavcodec/aacenc.c:1038: undefined reference to `ff_aac_dsp_init_x86'
> > collect2: ld returned 1 exit status
> > make: *** [ffserver_g] Error 1
> > make: *** Waiting for unfinished jobs....
> > libavcodec/libavcodec.a(aacenc.o): In function `aac_encode_init':
> > ffmpeg/arm/src/libavcodec/aacenc.c:1038: undefined reference to `ff_aac_dsp_init_x86'
> > collect2: ld returned 1 exit status
> > make: *** [ffprobe_g] Error 1
> > libavcodec/libavcodec.a(aacenc.o): In function `aac_encode_init':
> > ffmpeg/arm/src/libavcodec/aacenc.c:1038: undefined reference to `ff_aac_dsp_init_x86'
> > collect2: ld returned 1 exit status
> > make: *** [ffmpeg_g] Error 1
> >
> > [...]
> > --
> > Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
> >
> > While the State exists there can be no freedom; when there is freedom there
> > will be no State. -- Vladimir Lenin
> >
> >
> Attaching a new version with the fixes from James Almer which should also
> fix non-x86 compilation

>  aaccoder.c            |   27 +++++++--------
>  aaccoder_trellis.h    |    2 -
>  aaccoder_twoloop.h    |    2 -
>  aacenc.c              |    4 ++
>  aacenc.h              |    6 +++
>  aacenc_is.c           |    6 +--
>  aacenc_ltp.c          |    4 +-
>  aacenc_pred.c         |    6 +--
>  aacenc_quantization.h |    4 +-
>  aacenc_utils.h        |    2 -
>  x86/Makefile          |    2 +
>  x86/aacencdsp.asm     |   88 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  x86/aacencdsp_init.c  |   43 ++++++++++++++++++++++++
>  13 files changed, 170 insertions(+), 26 deletions(-)
> 84d67e14dbd62ef958a52a4027a8dff22f7480b6  0001-aacenc-add-SIMD-optimizations-for-abs_pow34-and-quan.patch
> From d92003e23d82bc40fd85712538983209a7704248 Mon Sep 17 00:00:00 2001
> From: Rostislav Pehlivanov <atomnuker at gmail.com>
> Date: Sat, 8 Oct 2016 15:59:14 +0100
> Subject: [PATCH] aacenc: add SIMD optimizations for abs_pow34 and quantization
> 
> Performance improvements:
> 
> quant_bands:
> with:     681 decicycles in quant_bands, 8388453 runs,    155 skips
> without: 1190 decicycles in quant_bands, 8388386 runs,    222 skips
> Around 42% for the function
> 
> Twoloop coder:
> 
> abs_pow34:
> with/without: 7.82s/8.17s
> Around 4% for the entire encoder
> 
> Both:
> with/without: 7.15s/8.17s
> Around 12% for the entire encoder
> 
> Fast coder:
> 
> abs_pow34:
> with/without: 3.40s/3.77s
> Around 10% for the entire encoder
> 
> Both:
> with/without: 3.02s/3.77s
> Around 20% faster for the entire encoder
> 
> Signed-off-by: Rostislav Pehlivanov <atomnuker at gmail.com>
> ---
>  libavcodec/aaccoder.c            | 27 ++++++------
>  libavcodec/aaccoder_trellis.h    |  2 +-
>  libavcodec/aaccoder_twoloop.h    |  2 +-
>  libavcodec/aacenc.c              |  4 ++
>  libavcodec/aacenc.h              |  6 +++
>  libavcodec/aacenc_is.c           |  6 +--
>  libavcodec/aacenc_ltp.c          |  4 +-
>  libavcodec/aacenc_pred.c         |  6 +--
>  libavcodec/aacenc_quantization.h |  4 +-
>  libavcodec/aacenc_utils.h        |  2 +-
>  libavcodec/x86/Makefile          |  2 +
>  libavcodec/x86/aacencdsp.asm     | 88 ++++++++++++++++++++++++++++++++++++++++
>  libavcodec/x86/aacencdsp_init.c  | 43 ++++++++++++++++++++
>  13 files changed, 170 insertions(+), 26 deletions(-)
>  create mode 100644 libavcodec/x86/aacencdsp.asm
>  create mode 100644 libavcodec/x86/aacencdsp_init.c
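
The percentages in the commit message above appear to be relative reductions in time (or decicycles) against the unoptimized run. A minimal sketch of that arithmetic, assuming the formula (without - with) / without; the helper name reduction_pct is hypothetical and not part of the patch:

```c
#include <assert.h>

/* Hypothetical helper: relative reduction in percent, assuming the
 * quoted figures mean (without - with) / without. */
static double reduction_pct(double with, double without)
{
    assert(without > 0.0);
    return 100.0 * (without - with) / without;
}

/* Values taken from the commit message:
 *   quant_bands:         reduction_pct(681.0, 1190.0)
 *   twoloop, abs_pow34:  reduction_pct(7.82, 8.17)
 *   twoloop, both:       reduction_pct(7.15, 8.17)
 *   fast, abs_pow34:     reduction_pct(3.40, 3.77)
 *   fast, both:          reduction_pct(3.02, 3.77)
 */
```

Under that assumption the results land close to the quoted "around" figures (42%, 4%, 12%, 10%, 20%).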

Still fails to build on arm-qemu.
It looks like you call a function that just isn't there on non-x86;
I assume an if (ARCH_X86) or #if is missing.
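
A minimal sketch of the kind of guard meant here, not the actual patch code. It assumes ARCH_X86 is a 0/1 macro (derived from compiler predefines below purely for illustration, rather than from a generated config header), and every Demo*/demo_* name is a hypothetical stand-in for the encoder's DSP context and init functions:

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: ARCH_X86 is always defined, as either 0 or 1. */
#if defined(__i386__) || defined(__x86_64__) || defined(_M_IX86) || defined(_M_X64)
#define ARCH_X86 1
#else
#define ARCH_X86 0
#endif

/* Hypothetical DSP context holding one function pointer. */
typedef struct DemoDSPContext {
    int (*sum)(const int *v, int n);
} DemoDSPContext;

/* Portable C fallback, available on every architecture. */
static int demo_sum_c(const int *v, int n)
{
    int acc = 0;
    for (int i = 0; i < n; i++)
        acc += v[i];
    return acc;
}

#if ARCH_X86
/* Stand-in for an x86-optimized routine; only compiled on x86. */
static int demo_sum_x86(const int *v, int n)
{
    return demo_sum_c(v, n);
}

static void demo_dsp_init_x86(DemoDSPContext *s)
{
    s->sum = demo_sum_x86;
}
#endif

static void demo_dsp_init(DemoDSPContext *s)
{
    assert(s != NULL);
    s->sum = demo_sum_c;      /* always install the C version first */
#if ARCH_X86
    demo_dsp_init_x86(s);     /* referenced only where it is defined */
#endif
}
```

With the call guarded this way, a non-x86 build never references the x86 init symbol, so the linker has nothing to resolve. An if (ARCH_X86) guard would serve the same purpose only as long as the compiler drops the dead call.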

LD      ffmpeg_g
libavcodec/libavcodec.a(aacenc.o): In function `aac_encode_init':
/home/michael/ffmpeg-git/ffmpeg/arm/src/libavcodec/aacenc.c:1038: undefined reference to `ff_aac_dsp_init_x86'
collect2: ld returned 1 exit status
make: *** [ffmpeg_g] Error 1

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

No snowflake in an avalanche ever feels responsible. -- Voltaire