[FFmpeg-devel] [PATCH] VP8 MMX optimizations (MC and IDCT dc_add)
Fri Jun 25 12:25:12 CEST 2010
> On Thu, Jun 24, 2010 at 6:33 PM, Jason Garrett-Glaser <darkshikari at gmail.com> wrote:
> > Now with 8x8 intra pred modes and non-broken line endings. Did I
> > mention this makes h264 faster too?
> > Dark Shikari
> And one more SSSE3 function, because pshufb is amazing.
Yes, but pshufb takes 4 cycles on Atom.
On Tue, Jun 22, 2010 at 3:29 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
>> + punpcklbw mm2, mm6
>> + ; first tap
>> + pshufw mm3, mm7, 0x0 ; splat first coeff
> are you sure all these pshufw are faster than reading them from a table?
Ronald's pshufw instructions follow the unpacks, so the tables would be twice as big.
On Atom, cache and memory are dirt slow, so keeping the tables small makes
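
To make the trade-off concrete, here is a rough scalar C model of what
"pshufw mm3, mm7, 0x0" does, next to the storage cost of the table-based
alternative. This is only a sketch: pshufw_model, TAPS, compact_table_bytes
and presplat_table_bytes are hypothetical names, and the six-tap, 16-bit
layout is an assumption for illustration, not the layout used in the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of MMX pshufw: each 2-bit field of imm picks one of the
 * four 16-bit source words for the corresponding destination word.
 * imm = 0x0 therefore copies word 0 into all four lanes (a "splat"). */
static void pshufw_model(uint16_t dst[4], const uint16_t src[4], uint8_t imm)
{
    uint16_t tmp[4];                 /* temporary so dst may alias src */
    for (int i = 0; i < 4; i++)
        tmp[i] = src[(imm >> (2 * i)) & 3];
    memcpy(dst, tmp, sizeof(tmp));
}

/* Assumed layout for illustration: one filter with TAPS coefficients,
 * each held as a 16-bit word. */
enum { TAPS = 6, WORDS_PER_MMX_REG = 4 };

/* Compact form: one word per tap, splatted at run time with pshufw. */
static size_t compact_table_bytes(void)
{
    return TAPS * sizeof(uint16_t);
}

/* Pre-splatted form: every tap stored as a full 8-byte MMX register image,
 * so no shuffle is needed but more cache is touched per filter. */
static size_t presplat_table_bytes(void)
{
    return TAPS * WORDS_PER_MMX_REG * sizeof(uint16_t);
}
```

Calling pshufw_model(out, coeffs, 0x00) fills out[] with four copies of
coeffs[0]. Under the assumed layout the pre-splatted table costs four times
the memory of the compact one, which is the kind of cache pressure being
weighed against the cost of the extra shuffles.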