[FFmpeg-devel] [PATCH] vp9: initial attempt at a idct_idct_4x4 12bpp x86 simd (sse2) impl.
Ronald S. Bultje
rsbultje at gmail.com
Mon Oct 12 16:25:34 CEST 2015
Hi,
On Sat, Oct 10, 2015 at 12:31 PM, Henrik Gramner <henrik at gramner.com> wrote:
> On Tue, Oct 6, 2015 at 9:59 PM, Ronald S. Bultje <rsbultje at gmail.com>
> wrote:
> > +cglobal vp9_idct_idct_4x4_add_12, 4, 4, 6, dst, stride, block, eob
> [...]
> > + movd m0, coefd
> > + punpcklwd m0, m0
> > + pshufd m0, m0, q0000
>
> pshuflw + punpcklqdq is faster on some older CPUs, such as Conroe.
Done.
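
For reference, a minimal sketch of the two word-broadcast sequences, written
with SSE2 intrinsics in C rather than the x86inc asm from the patch. It assumes
the suggestion replaces only the punpcklwd/pshufd pair and keeps the movd. The
function names are hypothetical and not part of the patch. It only illustrates
that both sequences produce the same register contents and says nothing about
their relative speed.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>
#include <string.h>

/* Sequence quoted from the patch: movd + punpcklwd + pshufd q0000. */
static inline __m128i splat_w_pshufd(int coef)
{
    __m128i m0 = _mm_cvtsi32_si128(coef);   /* movd      m0, coefd     */
    m0 = _mm_unpacklo_epi16(m0, m0);        /* punpcklwd m0, m0        */
    return _mm_shuffle_epi32(m0, 0x00);     /* pshufd    m0, m0, q0000 */
}

/* Suggested alternative: movd + pshuflw q0000 + punpcklqdq. */
static inline __m128i splat_w_pshuflw(int coef)
{
    __m128i m0 = _mm_cvtsi32_si128(coef);   /* movd       m0, coefd     */
    m0 = _mm_shufflelo_epi16(m0, 0x00);     /* pshuflw    m0, m0, q0000 */
    return _mm_unpacklo_epi64(m0, m0);      /* punpcklqdq m0, m0        */
}

/* Asserts that both sequences yield identical 128-bit results. */
static inline void check_splat_equivalence(int coef)
{
    __m128i a = splat_w_pshufd(coef);
    __m128i b = splat_w_pshuflw(coef);
    assert(memcmp(&a, &b, sizeof(a)) == 0);
}
```

Both functions broadcast the low 16 bits of coef to all eight word lanes; the
upper 16 bits of the input are discarded in either case.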
Ronald