[FFmpeg-devel] [RFC/RFBench] AVX FFT

Vitor Sessak vitor1001 at gmail.com
Fri Apr 1 19:12:47 CEST 2011


The following patches add an AVX (an Intel x86 extension) FFT 
implementation. Since I do not have a Sandy Bridge machine myself, I 
have no idea how it performs. Benchmarks (e.g., using fft-test -s) are 
thus very welcome. Also welcome are suggestions for optimizing it 
further, in particular the 8-point FFT (in the T8_AVX macro), which is 
not much faster than the SSE version.

One noteworthy thing about AVX is that it uses 256-bit registers, so 
av_malloc now needs to align pointers to 32-byte boundaries. If this 
patch is accepted, I'll have to change a bunch of audio decoders to 
increase their buffers' alignment (note that most AVX instructions do 
not fault when a 256-bit memory operand is only 128-bit aligned, 
explicitly aligned moves such as vmovaps excepted, but the access can 
straddle a cache line and thus cause a performance hit).
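
To make the alignment requirement concrete, here is a minimal sketch, 
assuming a POSIX system. The helper names alloc_avx_aligned and 
is_aligned are hypothetical, and this is not how av_malloc is 
implemented; it only shows what a 32-byte-aligned buffer looks like 
and how to check for one.

```c
#define _POSIX_C_SOURCE 200112L  /* for posix_memalign */
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define AVX_ALIGN 32  /* one 256-bit register is 32 bytes wide */

/* Hypothetical helper (not FFmpeg API): allocate `size` bytes aligned
 * to AVX_ALIGN. Returns NULL on failure; release with free(). */
static void *alloc_avx_aligned(size_t size)
{
    void *ptr = NULL;
    if (posix_memalign(&ptr, AVX_ALIGN, size) != 0)
        return NULL;
    return ptr;
}

/* Hypothetical helper: nonzero if `ptr` is a multiple of `alignment`. */
static int is_aligned(const void *ptr, size_t alignment)
{
    return ((uintptr_t)ptr % alignment) == 0;
}
```

A buffer returned by alloc_avx_aligned passes is_aligned(buf, 32), 
while buf + 16 bytes is still 16-byte aligned (enough for SSE) but no 
longer 32-byte aligned, which is the case the decoders' buffers would 
have to avoid.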


PS: cross-posted to both lists since I'm interested in feedback from 
both groups.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Change-x86-asm-FFT-permutation-to-later-AVX-FFT-addi.patch
Type: text/x-patch
Size: 1888 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20110401/0fd09670/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-Add-AVX-FFT-implementation.patch
Type: text/x-patch
Size: 17178 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20110401/0fd09670/attachment-0001.bin>
