[FFmpeg-devel] [PATCH 4/5] avcodec/h264: add avx 8-bit h264_idct_add

James Darnley jdarnley at obe.tv
Fri Apr 14 14:26:25 EEST 2017


On 2017-04-06 18:06, James Almer wrote:
> Your numbers are really confusing. Could you post the actual numbers for
> each function instead of doing comparisons?

These figures are the actual numbers!

Using the figures from Haswell above:
> ff_h264_idct_add_8_mmx  = 52 cycles
> ff_h264_idct_add_8_sse2 = 49 cycles
> ff_h264_idct_add_8_avx  = 46 cycles
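
For anyone who prefers ratios to raw cycle counts, the conversion is a single division. The snippet below is only an illustrative sketch and not part of the patch: the helper name `speedup` and the choice of the mmx version as the baseline are assumptions made for this example, and the cycle counts are the Haswell figures quoted above.

```python
# Cycle counts quoted above for Haswell (lower is better).
cycles = {
    "ff_h264_idct_add_8_mmx": 52,
    "ff_h264_idct_add_8_sse2": 49,
    "ff_h264_idct_add_8_avx": 46,
}


def speedup(baseline_cycles, candidate_cycles):
    """Return how many times faster the candidate is than the baseline.

    Hypothetical helper for illustration; it is not an FFmpeg function.
    """
    return baseline_cycles / candidate_cycles


# Assumption: use the mmx version as the baseline for every comparison.
baseline = cycles["ff_h264_idct_add_8_mmx"]
for name, count in cycles.items():
    print(f"{name}: {count} cycles, {speedup(baseline, count):.2f}x vs mmx")
```

With these figures the sse2 version comes out at roughly 1.06x and the avx version at roughly 1.13x relative to mmx.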

Coming back to this saved draft, I have removed a fair bit of ranting
and cut it down to the essential point.

Also, I forgot about the Pentium on which I tested the previous patches,
where I added the SSE2 versions.  From that commit message:
> Kaby Lake Pentium:
>  - ff_h264_idct_add_8_sse2:    ~1.18x faster than mmxext
>  - ff_h264_idct_dc_add_8_sse2: ~1.07x faster than mmxext

