[FFmpeg-devel] Anybody has a Core 2? [PATCH] Small SSSE3 optimization

Zuxy Meng zuxy.meng
Wed May 9 12:16:36 CEST 2007


Hi,

2007/5/9, Guillaume POIRIER <poirierg at gmail.com>:
> Hi,
>
> On 5/9/07, Zuxy Meng <zuxy.meng at gmail.com> wrote:
> > Hi,
> >
> > 2007/5/8, Zuxy Meng <zuxy.meng at gmail.com>:
> > > Hi,
> > >
> > > The attached patch uses the SSSE3 instruction pabsw to calculate the
> > > absolute value of packed words. Just for fun. I don't have an
> > > SSSE3-capable CPU, so hopefully someone with a Core 2 can help test it
> > > to ensure it doesn't break anything (better with benchmarks, of
> > > course :-) ).
> >
> >
> > Updated patch against current SVN HEAD. Full test passed on MMX2. Of
> > course it still needs testing under Core 2.
>
> cat /proc/cpuinfo
> processor       : 0
> vendor_id       : GenuineIntel
> cpu family      : 6
> model           : 15
> model name      : Intel(R) Xeon(R) CPU            5130  @ 2.00GHz
> stepping        : 6
> cpu MHz         : 2000.055
> cache size      : 4096 KB
> physical id     : 0
> siblings        : 2
> core id         : 0
> cpu cores       : 2
> fpu             : yes
> fpu_exception   : yes
> cpuid level     : 10
> wp              : yes
> flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
> mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall
> nx lm constant_tsc pni monitor ds_cpl vmx tm2 cx16 xtpr lahf_lm
> bogomips        : 4003.24
> clflush size    : 64
> cache_alignment : 64
> address sizes   : 36 bits physical, 48 bits virtual
>
> [...]
>
> make codectest passes, make test passes, make fulltest passes.
>
> \o/ !!

Cool! Can you do a small unit test to compare the MMX2 and SSSE3 versions
of hadamard8_diff? Intel doesn't give the latency of pabsw in their
manuals (while AMD always gives ALL instructions' latency & throughput),
but I guess it should be smaller than the sum of pxor, psubw and
pmaxsw :-)
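
To make the comparison concrete, here is a rough scalar sketch in plain C
of what one 16-bit lane does in each variant: a single pabsw versus the
pxor/psubw/pmaxsw sequence. It is only a model, not the code from the
patch; the helper names (lane_signed, lane_pabsw, lane_mmx2_abs,
count_abs_mismatches) and the register roles "z" and "a" are made up for
illustration. It checks that both give the same result for every 16-bit
input and says nothing about latency.

```c
#include <assert.h>
#include <stdint.h>

/* Interpret a 16-bit lane as a signed value without relying on
 * implementation-defined narrowing conversions. */
static int32_t lane_signed(uint16_t bits)
{
    return bits >= 0x8000u ? (int32_t)bits - 0x10000 : (int32_t)bits;
}

/* Model of one lane of pabsw: absolute value of a signed word.
 * -32768 has no positive counterpart and stays 0x8000. */
static uint16_t lane_pabsw(uint16_t a)
{
    int32_t s = lane_signed(a);
    return (uint16_t)(s < 0 ? -s : s);
}

/* Model of one lane of the three-instruction sequence
 * (hypothetical register names z and a). */
static uint16_t lane_mmx2_abs(uint16_t a)
{
    uint16_t z = 0;                                  /* pxor:   z = 0            */
    z = (uint16_t)(z - a);                           /* psubw:  z = 0 - a, wraps */
    return lane_signed(z) > lane_signed(a) ? z : a;  /* pmaxsw: max(z, a)        */
}

/* Compare both models over all 65536 lane values; returns the number
 * of inputs where they disagree. */
static long count_abs_mismatches(void)
{
    long mismatches = 0;
    uint32_t v;

    /* Corner case: the most negative word is left unchanged. */
    assert(lane_pabsw(0x8000u) == 0x8000u);

    for (v = 0; v <= 0xFFFFu; v++)
        if (lane_pabsw((uint16_t)v) != lane_mmx2_abs((uint16_t)v))
            mismatches++;
    return mismatches;
}
```

Calling count_abs_mismatches() from a small main() should report no
mismatches if the two sequences are interchangeable. A real test would
still have to run both hadamard8_diff versions on the same blocks and
time them on a Core 2.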
-- 
Zuxy
Beauty is truth,
While truth is beauty.
PGP KeyID: E8555ED6



