[FFmpeg-devel] av_resample_init() optimization

Olivier Guilyardi list
Fri Jan 8 00:36:29 CET 2010

On 01/08/2010 12:08 AM, Måns Rullgård wrote:
> Olivier Guilyardi <list at samalyse.com> writes:
>> On 01/06/2010 11:43 PM, Michael Niedermayer wrote:
>>> On Wed, Jan 06, 2010 at 07:34:42PM +0100, Olivier Guilyardi wrote:
>>>> I'm using libavcodec resampling on an Android ARM (Qualcomm
>>>> MSM7200A 528 MHz) device, without FPU support. The fact that it's
>>>> integer-based is really great for performance (this is for a
>>>> closed source application, but I plan full LGPL compliance with
>>>> dynamic linking).
>>>> Nevertheless, there is an overhead upon initialization, when calling:
>>>> av_resample_init(44100, 48000, 16, 10, 0, 0.8).
>>>> This comes from av_build_filter() which takes about 6.9 *seconds*
>>>> to build the polyphase filterbank.
>>> uhm 7 seconds
>>>> About 300ms (4%) can be saved by applying resample-xfactor.1.patch
>>>> to r21036. It caches a constant factor and prevents bessel()
>>>> and sqrt() from being called when unnecessary. I'm not sure
>>>> whether the cached factor affects accuracy.
>>> Don't use Kaiser windows if you have no FPU, or optimize the code
>>> properly; it's surely possible to do this faster with a table and cubic
>>> interpolation. Also, I think your float/double emulation might be
>>> trash, because even without an FPU 7 sec appears quite extreme: it just
>>> calculates ~16k samples of a function on a ~500 MHz CPU in 7 sec,
>>> that's 200k CPU cycles for a single value.
>> Well, I tried the cubic type by setting WINDOW_TYPE to 0, and calling
>> av_resample_init() as above. Initialization was indeed much faster, but it
>> sounded terrible. Any advice on the parameters?
>> In regard to floating-point emulation, I'm using gcc 4.2.1 bundled with the Android
>> NDK, which reports itself as:
>> [...]
>> And resample2.c is compiled with (I don't have full control on these):
>> build/prebuilt/linux-x86/arm-eabi-4.2.1/bin/arm-eabi-gcc
>> -Ibuild/platforms/android-3/arch-arm/usr/include -march=armv5te -mtune=xscale
>> -msoft-float -fpic -mthumb-interwork -ffunction-sections -funwind-tables
>> -fstack-protector -fno-short-enums -D__ARM_ARCH_5__ -D__ARM_ARCH_5T__
>> -D__ARM_ARCH_5E__ -D__ARM_ARCH_5TE__  -O2 -fomit-frame-pointer -fstrict-aliasing
>> -funswitch-loops -finline-limit=300
>> -I/home/olivier/dev/android/ar/project/jni/../../lib/ffmpeg/android/..
>> -I/home/olivier/dev/android/ar/project/jni/../../lib/ffmpeg/android -DANDROID
>> -Wall -Drestrict="" -DCONFIG_RESAMPLE_FAST_INIT -O2 -DNDEBUG -g  -c -MMD -MP -MF
>> out/apps/audiorec//objs/avcodec-resample/../libavcodec/resample2.o.d.tmp
>> /home/olivier/dev/android/ar/project/jni/../../lib/ffmpeg/android/../libavcodec/resample2.c
>> -o out/apps/audiorec//objs/avcodec-resample/../libavcodec/resample2.o
> Why are you not building FFmpeg with its native build system?  Those
> flags are basically insane for your device.

Because I'm building for the Android platform, that is: a lot of devices with
various ARM CPUs, the one I'm using being only one of them. So I use the Android
NDK build system which is supposed to ensure portability over all of those
devices. Plus I need to save space, so I only compile resample2.c and mem.c.

> From my googling, the MSM7200A has an ARM1136 processor with VFP
> support, so you could do a hell of a lot better with a properly configured
> FFmpeg.

I mentioned the type of processor and its speed only as a reference for my
timing measurements. It is not my only target. As I said previously, my application
may run on other CPUs. Some Android devices have an FPU, some don't. This is
why the Android NDK enforces -msoft-float, and at this point in time there's
really nothing I can do about that.

Hence my patches, which, with a little tweaking, make it possible to initialize
two rather high-quality integer-based resamplers in 1.5s instead of 15s...

