[FFmpeg-devel] [PATCH] x86/flacdsp: add SSE2 and AVX decorrelate functions

Clément Bœsch u at pkh.me
Sun Nov 2 23:43:12 CET 2014


On Sun, Nov 02, 2014 at 07:31:48PM -0300, James Almer wrote:
> Two to four times faster depending on instruction set, block size and channel count.
> 
> Signed-off-by: James Almer <jamrial at gmail.com>
> ---
> TODO: 16 bits indep for 4, 6 and 8 channels. 24/32 bits indep for 8 channels.
>       AVX2 and maybe MMX versions.
>       Planar?
> 
>  libavcodec/arm/flacdsp_init_arm.c |   2 +-
>  libavcodec/flacdec.c              |   6 +-
>  libavcodec/flacdsp.c              |   6 +-
>  libavcodec/flacdsp.h              |   6 +-
>  libavcodec/flacenc.c              |   2 +-
>  libavcodec/x86/flacdsp.asm        | 206 ++++++++++++++++++++++++++++++++++++++
>  libavcodec/x86/flacdsp_init.c     |  48 ++++++++-
>  7 files changed, 264 insertions(+), 12 deletions(-)
[...]
> +    mova       m0, [in0q]
> +    mova       m1, [in0q+in1q]
> +%if %1 > 2
> +    mova       m2, [in0q+in2q]
> +    mova       m3, [in0q+in3q]
> +%if %1 > 4
> +    mova       m4, [in0q+in4q]
> +    mova       m5, [in0q+in5q]
> +%endif
> +%endif
> +    pslld      m0, m%2
> +    pslld      m1, m%2
> +%if %1 > 2
> +    pslld      m2, m%2
> +    pslld      m3, m%2
> +%if %1 > 4
> +    pslld      m4, m%2
> +    pslld      m5, m%2
> +%endif
> +%endif

Can't you do something like this? (untested)
    mova       m0, [in0q]
    %assign i 1
    %rep %1-1
    mova       m %+ i, [in0q+in %+ i %+ q]
    %assign    i i+1
    %endrep
    %assign i 0
    %rep %1
    pslld      m %+ i, m%2
    %assign    i i+1
    %endrep
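
For reference, a scalar C model of what the quoted load + shift step computes, written as a loop over the channel count rather than unrolled by hand. This is only a sketch: load_and_shift, LANES and the regs array are hypothetical names used for illustration and are not part of the patch, and the model assumes 32-bit samples with four lanes per 128-bit register.

```c
#include <stdint.h>
#include <stddef.h>

#define LANES 4  /* assumed: four 32-bit lanes per 128-bit register */

/* Hypothetical scalar model of the per-channel load + shift step.
 * regs[ch] stands in for register m<ch>, in[ch] for the channel pointer.
 * One outer iteration per channel, like one %rep iteration in the asm. */
static void load_and_shift(int32_t regs[][LANES], const int32_t *const *in,
                           size_t pos, int channels, int shift)
{
    for (int ch = 0; ch < channels; ch++) {
        for (int lane = 0; lane < LANES; lane++) {
            /* Shift as unsigned: pslld is a logical left shift, and this
             * avoids undefined behaviour on negative samples in C. */
            uint32_t v = (uint32_t)in[ch][pos + lane];
            regs[ch][lane] = (int32_t)(v << shift);
        }
    }
}
```

The point is only that the work per channel is identical, so the channel count can drive a repeat count instead of nested %if blocks.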

[...]

-- 
Clément B.