[FFmpeg-devel] libavcodec/exr : add X86 64 SIMD for reorder pixels (SSE and AVX2) (v4)

Martin Vignali martin.vignali at gmail.com
Sun Sep 17 21:22:49 EEST 2017


Hello,

Following Henrik Gramner's comments, a new patch is attached.

I replaced int size with ptrdiff_t size.

I simplified the code, keeping only one loop (easier to read, and it doesn't
have a real impact on speed).
I use the SBUTTERFLY macro for SSE.
For AVX2, I keep my previous approach.
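
For readers without the attachment, here is a minimal scalar sketch of what I understand the function to do, assuming it interleaves the first and second halves of the source buffer byte by byte. The name reorder_pixels_ref is hypothetical and this is not the code from the patch; it only illustrates the single-loop shape and the ptrdiff_t size parameter. The SIMD versions do the same interleave on many bytes at once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference, not the code from the attached patch.
 * Assumption: src holds two halves of size/2 bytes each (a0 a1 ... and
 * b0 b1 ...), and dst receives them interleaved: a0 b0 a1 b1 ... */
static void reorder_pixels_ref(uint8_t *dst, const uint8_t *src, ptrdiff_t size)
{
    const ptrdiff_t half_size = size / 2;
    const uint8_t *t1 = src;             /* first half  */
    const uint8_t *t2 = src + half_size; /* second half */
    ptrdiff_t i;

    assert(dst != src); /* the interleave cannot be done in place */

    /* Single loop: one byte from each half per iteration. */
    for (i = 0; i < half_size; i++) {
        dst[2 * i]     = t1[i];
        dst[2 * i + 1] = t2[i];
    }
}
```

With size as a ptrdiff_t, the asm side can use it directly as a pointer offset without a sign extension.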

It passes the fate-exr tests for me (OS X).

Current benchmark
AVX2
239920 decicycles in reorder_pixels_zip,  130958 runs,    114 skips
bench: utime=101.367s

SSE
283768 decicycles in reorder_pixels_zip,  130948 runs,    124 skips
bench: utime=101.424s

Scalar
3119101 decicycles in reorder_pixels_zip,  130429 runs,    643 skips
bench: utime=114.414s
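
To put these figures in proportion, a small sketch of the arithmetic; the helper names speedup and percent_saved are hypothetical and only used for illustration.

```c
/* Hypothetical helpers for comparing benchmark figures. */

/* Ratio of baseline cost to optimized cost (higher is better). */
static double speedup(double baseline, double optimized)
{
    return baseline / optimized;
}

/* Share of the baseline saved by the optimized version, in percent. */
static double percent_saved(double baseline, double optimized)
{
    return 100.0 * (baseline - optimized) / baseline;
}
```

With the numbers above, speedup(3119101, 239920) is roughly 13 for AVX2 and speedup(3119101, 283768) is roughly 11 for SSE, measured on reorder_pixels_zip alone. On the whole run, percent_saved(114.414, 101.367) is about 11% of utime.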


Results with the asm suggested by Henrik:
AVX2 :
258602 decicycles in reorder_pixels_zip,  130853 runs,    219 skips

SSE :
285167 decicycles in reorder_pixels_zip,  130863 runs,    209 skips

In terms of overall speed using -benchmark, the difference from the current
patch is hard to see.


Martin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-libavcodec-exr-add-X86-64-SIMD-for-reorder_pixels.patch
Type: application/octet-stream
Size: 15194 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20170917/8236d737/attachment.obj>

