[Ffmpeg-devel] [PATCH] fix mpeg4 lowres chroma bug and increase h264/mpeg4 MC speed

Oleg Metelitsa oleg
Thu Feb 8 10:43:24 CET 2007


Hello Trent,

What about replacing the two MMX instructions with a single SSE
instruction, as I proposed here:
http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/2007-February/052187.html

As a result we will have

#define H264_CHROMA_OP2(S,D,T)   "pinsrw $1, 2+" #S ", " #D " \n\t"

instead of

>> +#define H264_CHROMA_OP2(S,D,T) "movd 2+" #S ", " #T "\n\t"\
>> +                               "punpcklwd " #T ", " #D "\n\t"
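
To make the comparison concrete, here is a rough, portable C model of what
each variant leaves in D. It is only a sketch: the mmx_t type and the helper
names are made up for illustration, registers are modelled as four 16-bit
lanes, and the real macros are of course inline asm. In this model both
variants agree on words 0 and 1 of D; words 2 and 3 differ, so the
replacement is only safe if the surrounding code does not use them (that
part is an assumption to check against the actual function).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 64-bit MMX register: four 16-bit lanes, w[0] lowest. */
typedef struct { uint16_t w[4]; } mmx_t;

/* Read one 16-bit word from memory (host byte order, as a plain load would). */
static uint16_t load16(const uint8_t *p)
{
    uint16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Models: movd 2+S, T ; punpcklwd T, D */
static mmx_t op2_movd_punpcklwd(const uint8_t *s, mmx_t d)
{
    /* movd loads 32 bits from S+2 into the low half of T and zeroes the rest. */
    mmx_t t = {{ load16(s + 2), load16(s + 4), 0, 0 }};
    /* punpcklwd interleaves the low words of D and T: D0, T0, D1, T1. */
    mmx_t r = {{ d.w[0], t.w[0], d.w[1], t.w[1] }};
    return r;
}

/* Models: pinsrw $1, 2+S, D */
static mmx_t op2_pinsrw(const uint8_t *s, mmx_t d)
{
    /* pinsrw replaces only word 1 of D; the other words are left untouched. */
    d.w[1] = load16(s + 2);
    return d;
}

/* Checks the one property both variants share: identical low two words. */
static void check_low_words_match(const uint8_t *s, mmx_t d)
{
    mmx_t a = op2_movd_punpcklwd(s, d);
    mmx_t b = op2_pinsrw(s, d);
    assert(a.w[0] == b.w[0]);
    assert(a.w[1] == b.w[1]);
}
```

Calling check_low_words_match() with any 6-byte source buffer and any
starting D passes, which is all the model claims; it says nothing about
speed.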

============================================


>> I benchmarked my version, by measuring put_h264_chroma_mc2_mmx2() from
>> start to finish with rdtsc, as 12.8% faster than before.

The speed increased because bringing memory into the cache is faster for
reads than for writes. So you are effectively doing a fast cache preload,
and that speeds up the code. Is the patch still faster if you test it
during an actual decode rather than in isolation?

Oleg






