[FFmpeg-devel] [PATCH 0/2] x86: hevc_mc: port to SSSE3

Christophe Gisquet christophe.gisquet at gmail.com
Sat Aug 23 15:22:33 CEST 2014

As far as I can see, the only reason these functions require SSE4 is
the pextrw needed for the following block widths:
- 2, used only by chroma;
- 6, used by chroma and indirectly by luma;
- 12, used by both.
The better solution would be to convert all chroma handling to NV12, but
it is vastly simpler to modify the above cases so they do not use pextrw.

This is done in 2 steps:
- Fix width 12 to be handled as 8+4 instead of 6+6;
- Modify the store macros for widths 2 and 6 to pass data through
  a GPR (alas, for some functions, at the cost of an extra GPR).
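
To illustrate the second step, here is a rough sketch in C with SSE2
intrinsics rather than the actual asm macros. It assumes 8-bit pixels,
so width 2 is 2 bytes and width 6 is 6 bytes. The names store_w2 and
store_w6 are made up for this example and are not taken from
hevc_mc.asm. The #else branch only exists so the sketch still builds on
non-x86 hosts.

```c
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__) || defined(_M_X64) || defined(_M_AMD64)
#include <emmintrin.h>   /* SSE2 only; nothing from SSE4.1 is used */

/* Width 2: move the low dword to a GPR, then do a 16-bit store from it. */
static void store_w2(uint8_t *dst, const uint8_t *row)
{
    /* row stands in for the filter output held in an xmm register */
    __m128i  v   = _mm_loadu_si128((const __m128i *)row);
    uint32_t gpr = (uint32_t)_mm_cvtsi128_si32(v);   /* movd r32, xmm */
    uint16_t w   = (uint16_t)gpr;                    /* low word of the GPR */
    memcpy(dst, &w, sizeof(w));                      /* mov [dst], r16 */
}

/* Width 6: 4 bytes straight from the register, last 2 bytes via a GPR. */
static void store_w6(uint8_t *dst, const uint8_t *row)
{
    __m128i  v   = _mm_loadu_si128((const __m128i *)row);
    uint32_t lo  = (uint32_t)_mm_cvtsi128_si32(v);   /* movd [dst], xmm */
    uint32_t gpr = (uint32_t)_mm_cvtsi128_si32(
                       _mm_srli_si128(v, 4));        /* psrldq + movd r32, xmm */
    uint16_t w   = (uint16_t)gpr;
    memcpy(dst, &lo, sizeof(lo));
    memcpy(dst + 4, &w, sizeof(w));                  /* mov [dst+4], r16 */
}
#else
/* Scalar fallback for non-x86 builds: same bytes written, no SIMD. */
static void store_w2(uint8_t *dst, const uint8_t *row) { memcpy(dst, row, 2); }
static void store_w6(uint8_t *dst, const uint8_t *row) { memcpy(dst, row, 6); }
#endif
```

Both helpers read 16 bytes from row and write exactly 2 or 6 bytes to
dst. The extra temporary (gpr) is the C counterpart of the
supplementary GPR mentioned above.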

Christophe Gisquet (2):
  x86: hevc_mc: split differently calls
  x86: hevc_mc: convert to ssse3

 libavcodec/x86/hevc_mc.asm    |  63 +++--
 libavcodec/x86/hevcdsp.h      |  48 ++--
 libavcodec/x86/hevcdsp_init.c | 561 ++++++++++++++++++++++--------------------
 3 files changed, 362 insertions(+), 310 deletions(-)
