[FFmpeg-devel] [PATCH 1/9] SBR DSP x86: implement SSE qmf_pre_shuffle
Michael Niedermayer
michaelni at gmx.at
Fri Apr 5 15:56:14 CEST 2013
On Thu, Apr 04, 2013 at 07:45:45PM +0000, Christophe Gisquet wrote:
> From 253 to 70c on Arrandale and Win64.
> ---
> libavcodec/x86/sbrdsp.asm | 33 +++++++++++++++++++++++++++++++++
> libavcodec/x86/sbrdsp_init.c | 2 ++
> 2 files changed, 35 insertions(+)
>
> diff --git a/libavcodec/x86/sbrdsp.asm b/libavcodec/x86/sbrdsp.asm
> index 1b7f3a8..2029b45 100644
> --- a/libavcodec/x86/sbrdsp.asm
> +++ b/libavcodec/x86/sbrdsp.asm
> @@ -220,3 +220,36 @@ cglobal sbr_qmf_post_shuffle, 2,3,4,W,z
> cmp zq, r2q
> jl .loop
> REP_RET
> +
> +INIT_XMM sse
> +cglobal sbr_qmf_pre_shuffle, 1,4,7,z
> +%define OFFSET (32*4-2*mmsize)
> + mov r3q, OFFSET
> + lea r1q, [zq + (32+1)*4]
> + lea r2q, [zq + 64*4]
> + mova m6, [ps_neg]
> +.loop:
> + movu m0, [r1q]
> + movu m2, [r1q + mmsize]
> + movu m1, [zq + r3q + 4 + mmsize]
> + movu m3, [zq + r3q + 4]
> + xorps m2, m6
> + xorps m0, m6
> + shufps m2, m2, q0123
> + shufps m0, m0, q0123
> + mova m5, m2
> + mova m4, m0
> + unpcklps m2, m3
> + unpckhps m5, m3
> + unpcklps m0, m1
> + unpckhps m4, m1
> + mova [r2q + 2*r3q + 0*mmsize], m2
> + mova [r2q + 2*r3q + 1*mmsize], m5
> + mova [r2q + 2*r3q + 2*mmsize], m0
> + mova [r2q + 2*r3q + 3*mmsize], m4
> + add r1q, 2*mmsize
> + sub r3q, 2*mmsize
> + jge .loop
> + mova m2, [zq]
> + movlps [r2q], m2
Using simpler memory indexing ([r2q + n*mmsize] and [zq])
and incrementing them separately seems 1-2 CPU cycles faster here
on Sandy Bridge.
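
To make the addressing question concrete, here is a scalar C model of what the
quoted loop appears to compute, worked out by reading the assembly above rather
than taken from the FFmpeg tree. The function names are hypothetical. The first
variant indexes everything off one base pointer, in the spirit of
[r2q + 2*r3q + n*mmsize]; the second walks separate source and destination
pointers, which is the restructuring suggested here. It only illustrates the
two addressing styles; the 1-2 cycle difference was observed on the assembly
and would not necessarily show up in compiled C.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical scalar model; z must hold at least 128 floats. */

/* Variant A: one base pointer plus a scaled index for every access. */
static void pre_shuffle_indexed(float *z)
{
    for (int k = 31; k >= 1; k--) {
        z[64 + 2 * k]     = -z[64 - k]; /* negated, reversed upper half */
        z[64 + 2 * k + 1] =  z[k + 1];  /* lower half, interleaved */
    }
    z[64] = z[0];
    z[65] = z[1];
}

/* Variant B: each stream has its own pointer, advanced separately. */
static void pre_shuffle_pointers(float *z)
{
    const float *neg = z + 33;  /* walks up:   z[33], z[34], ... */
    const float *pos = z + 32;  /* walks down: z[32], z[31], ... */
    float *dst = z + 126;       /* walks down two floats per step */

    for (int k = 31; k >= 1; k--) {
        dst[0] = -*neg++;
        dst[1] =  *pos--;
        dst -= 2;
    }
    z[64] = z[0];
    z[65] = z[1];
}

/* Returns 1 if both variants produce identical output on a test ramp. */
static int pre_shuffle_variants_agree(void)
{
    float a[128], b[128];

    for (int i = 0; i < 128; i++)
        a[i] = b[i] = (float)(i + 1);
    pre_shuffle_indexed(a);
    pre_shuffle_pointers(b);
    assert(a[64] == a[0] && a[65] == a[1]);
    return memcmp(a, b, sizeof(a)) == 0;
}
```

Calling pre_shuffle_variants_agree() from a small main() is enough to check
that the two forms are interchangeable; reads come from z[0..63] and writes go
to z[64..127], so the iteration order does not matter.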
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
In fact, the RIAA has been known to suggest that students drop out
of college or go to community college in order to be able to afford
settlements. -- The RIAA