[FFmpeg-devel] [PATCH] NEON FFT/IMDCT
Thu Sep 10 10:55:16 CEST 2009
Naotoshi Nojiri <naonoj at gmail.com> writes:
> 2009/9/8 Måns Rullgård <mans at mansr.com>:
>> Måns Rullgård <mans at mansr.com> writes:
>>> Naotoshi Nojiri <naonoj at gmail.com> writes:
>>>> I tested the patch on Cortex-A8 @500MHz (BeagleBoard).
>>>> FFT (fft-test -s):
>>>> 440.8 -> 34.2 us/transform (12.9x speed up)
>>>> IMDCT (fft-test -i -m -s):
>>>> 142.4 -> 11.8 us/transform (12.1x speed up)
>>>> I had written NEON intrinsics code a bit, but this is my first
>>>> ARM/NEON code in assembly.
>>>> So, any comments and suggestions would be appreciated.
>>> Inline asm is unacceptable.
>> I have a faster, pure-asm version of the mdct stuff almost ready.  No
>> need to resubmit.
> Thank you for all of your comments and advice. I revised the patch.
> The latest performance is as follows.
> FFT (fft-test -s):
> IMDCT (fft-test -i -m -s):
> I also wrote a pure-asm version of MDCT, but because it doesn't
> improve the performance, please ignore the part and use the FFT part
Thanks. I've committed a slightly improved variant.
mans at mansr.com