[FFmpeg-devel] libavcodec/exr : add X86 64 SIMD for reorder pixels (SSE and AVX2) (v4)
Martin Vignali
martin.vignali at gmail.com
Sun Sep 17 21:22:49 EEST 2017
Hello,
Following Henrik Grammer's comments, a new patch is attached.
It replaces int size with ptrdiff_t size.
I simplified the code, keeping only one loop (easier to read, and it doesn't
have a real impact on speed).
I use the SBUTTERFLY macro for SSE.
For AVX2 I keep my previous approach.
It passes the fate-exr tests for me (OS X).
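
For readers without the attachment, here is a rough C sketch of the operation being optimized. It is not the code from the patch: the patch is written in x86 assembly, and the intrinsics below are swapped in only to illustrate the byte interleave that an SBUTTERFLY-style unpack performs. The function names are hypothetical, and the sketch assumes an even size and makes no alignment assumptions.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Scalar reference (hypothetical name): interleave the first half of src
 * with the second half, so that dst[2*i] = src[i] and
 * dst[2*i + 1] = src[size/2 + i]. Assumes size is even. */
static void reorder_pixels_scalar(uint8_t *dst, const uint8_t *src,
                                  ptrdiff_t size)
{
    const ptrdiff_t half = size / 2;
    ptrdiff_t i;

    for (i = 0; i < half; i++) {
        dst[2 * i]     = src[i];
        dst[2 * i + 1] = src[half + i];
    }
}

/* SIMD sketch (hypothetical name): one loop, 16 bytes from each half per
 * iteration. The low/high byte unpacks play the role of a byte-wise
 * butterfly (punpcklbw / punpckhbw). Falls back to scalar code for the
 * tail, or entirely when SSE2 is not available at compile time. */
static void reorder_pixels_simd_sketch(uint8_t *dst, const uint8_t *src,
                                       ptrdiff_t size)
{
    const ptrdiff_t half = size / 2;
    ptrdiff_t i = 0;

#if defined(__SSE2__)
    for (; i + 16 <= half; i += 16) {
        /* Unaligned loads: no alignment is assumed in this sketch. */
        __m128i a = _mm_loadu_si128((const __m128i *)(src + i));
        __m128i b = _mm_loadu_si128((const __m128i *)(src + half + i));

        /* a0 b0 a1 b1 ... a7 b7, then a8 b8 ... a15 b15 */
        _mm_storeu_si128((__m128i *)(dst + 2 * i),
                         _mm_unpacklo_epi8(a, b));
        _mm_storeu_si128((__m128i *)(dst + 2 * i + 16),
                         _mm_unpackhi_epi8(a, b));
    }
#endif
    /* Scalar tail for the remaining bytes. */
    for (; i < half; i++) {
        dst[2 * i]     = src[i];
        dst[2 * i + 1] = src[half + i];
    }
}
```

Taking size as ptrdiff_t lets it be used directly as a pointer offset in 64-bit code, with no sign extension needed.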
Current benchmark
AVX2
239920 decicycles in reorder_pixels_zip, 130958 runs, 114 skips
bench: utime=101.367s
SSE
283768 decicycles in reorder_pixels_zip, 130948 runs, 124 skips
bench: utime=101.424s
Scalar
3119101 decicycles in reorder_pixels_zip, 130429 runs, 643 skips
bench: utime=114.414s
Results with the asm suggested by Henrik:
AVX2 :
258602 decicycles in reorder_pixels_zip, 130853 runs, 219 skips
SSE :
285167 decicycles in reorder_pixels_zip, 130863 runs, 209 skips
In terms of speed, using -benchmark, the difference from the current patch is
hard to see.
Martin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-libavcodec-exr-add-X86-64-SIMD-for-reorder_pixels.patch
Type: application/octet-stream
Size: 15194 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20170917/8236d737/attachment.obj>