[FFmpeg-devel] libavcodec/exr : add SIMD for reorder pixels (SSE and AVX2) v3 (WIP)

Martin Vignali martin.vignali at gmail.com
Sun Sep 10 18:17:49 EEST 2017


New version attached,
for SIMD optimization of reorder_pixels
(used by RLE and ZIP decompression).
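
For readers unfamiliar with the function, here is a minimal scalar sketch of this kind of byte reorder. It assumes the usual EXR layout, where the first half of the input holds the bytes for even output positions and the second half holds the bytes for odd output positions. The name reorder_pixels_sketch and its argument order are hypothetical and are not taken from the attached patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference: interleave the two halves of src into dst.
 * dst[0] = src[0], dst[1] = src[half], dst[2] = src[1], dst[3] = src[half + 1], ...
 * where half = (size + 1) / 2. */
static void reorder_pixels_sketch(uint8_t *dst, const uint8_t *src, size_t size)
{
    const uint8_t *t1 = src;                  /* start of first half  */
    const uint8_t *t2 = src + (size + 1) / 2; /* start of second half */
    size_t i = 0;

    while (i < size) {
        dst[i++] = *t1++;        /* even output position */
        if (i < size)
            dst[i++] = *t2++;    /* odd output position  */
    }
}
```

A SIMD version of such a loop would typically load one vector from each half and interleave them with byte-unpack instructions; whether the patch does exactly that is not stated here.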

Passes the FATE tests for me (on Mac OS X).

Tested by decoding a sequence of 150 HD EXR images (CGI render
with 17 layers per file, float pixels, ZIP16 compression).

AVX2 seems to provide only a small speed improvement (if anyone has an idea
about how to improve it).

The results :
Scalar :
2734448 decicycles in reorder_pixels_zip,  130476 runs,    596 skips
bench: utime=121.045s
bench: maxrss=608714752kB

SSE    :
 282900 decicycles in reorder_pixels_zip,  130935 runs,    137 skips
bench: utime=107.310s
bench: maxrss=615378944kB

AVX2   :
 247404 decicycles in reorder_pixels_zip,  130894 runs,    178 skips
bench: utime=107.182s
bench: maxrss=615391232kB
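
To put the timer figures in proportion, here is a small sketch that turns the decicycle counts above into speedup ratios. The helper name speedup and the constant names are hypothetical. The second figure is assumed to come from the SSE run, since it sits between the scalar and AVX2 results.

```c
/* How many times faster "after" is than "before", from two timer readings. */
static double speedup(double before, double after)
{
    return before / after;
}

/* Decicycle counts copied from the results above. */
static const double SCALAR_DECICYCLES = 2734448.0;
static const double SECOND_DECICYCLES = 282900.0;  /* assumed to be the SSE run */
static const double AVX2_DECICYCLES   = 247404.0;
```

With these numbers, speedup(SCALAR_DECICYCLES, SECOND_DECICYCLES) is roughly 9.7 and speedup(SCALAR_DECICYCLES, AVX2_DECICYCLES) is roughly 11, while speedup(SECOND_DECICYCLES, AVX2_DECICYCLES) is only about 1.14. The total utime drops by about 11% in both SIMD runs (121.045s to 107.310s and 107.182s).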

The overread is 1x mmsize (16 bytes in SSE, 32 in AVX2)
The overwrite is 2x mmsize (32 bytes in SSE, 64 in AVX2)
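
In practice this means callers need padding after both buffers. A minimal sketch of the resulting allocation sizes, assuming the overread and overwrite figures above are upper bounds; the helper names src_alloc_size and dst_alloc_size are hypothetical and do not come from the patch:

```c
#include <stddef.h>

/* Vector register width in bytes (what x86 asm code calls mmsize). */
enum { MMSIZE_SSE = 16, MMSIZE_AVX2 = 32 };

/* Hypothetical helper: source buffer size if the routine may read
 * up to 1x mmsize past the end of the payload. */
static size_t src_alloc_size(size_t size, size_t mmsize)
{
    return size + mmsize;
}

/* Hypothetical helper: destination buffer size if the routine may write
 * up to 2x mmsize past the end of the payload. */
static size_t dst_alloc_size(size_t size, size_t mmsize)
{
    return size + 2 * mmsize;
}
```

For example, a 100-byte payload would need at least 116 source bytes and 132 destination bytes for SSE, and 132 source bytes and 164 destination bytes for AVX2.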

Comments welcome.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-libavcodec-exr-add-SIMD-reorder_pixels-for-SSE2-and-.patch
Type: application/octet-stream
Size: 16053 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20170910/58500a9e/attachment.obj>
