[FFmpeg-devel] [PATCH] Add macros used in opus_pvq_search to x86util.asm

Henrik Gramner henrik at gramner.com
Sun Aug 6 14:12:23 EEST 2017


On Sat, Aug 5, 2017 at 9:10 PM, Ivan Kalvachev <ikalvachev at gmail.com> wrote:
> +%macro VBROADCASTSS 2 ; dst xmm/ymm, src m32/xmm
> +%if cpuflag(avx2)
> +    vbroadcastss  %1, %2                    ; ymm, xmm
> +%elif cpuflag(avx)
> +    %ifnum sizeof%2         ; avx1 register
> +        vpermilps  xmm%1, xmm%2, q0000      ; xmm, xmm, imm || ymm, ymm, imm

Nit: Use shufps instead of vpermilps. It's one byte shorter (it can use
the 2-byte VEX prefix) but otherwise identical in this case.

c5 e8 c6 ca 00    vshufps xmm1,xmm2,xmm2,0x0
c4 e3 79 04 ca 00 vpermilps xmm1,xmm2,0x0
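
The size difference in the two encodings above comes from the VEX prefix. A minimal Python sketch, assuming both byte strings are VEX-encoded instructions, reads the prefix to show why: vshufps lives in the 0F opcode map and fits the 2-byte C5 prefix, while vpermilps with an immediate lives in the 0F3A map, which only the 3-byte C4 prefix can select. The helper name `vex_info` is hypothetical and exists only for this illustration.

```python
def vex_info(code: bytes):
    """Return (prefix_len, opcode_map, total_len) for a VEX-encoded instruction.

    Assumes the first byte is a VEX prefix (C4 or C5), as in the dump above.
    """
    if code[0] == 0xC5:
        # 2-byte VEX: the opcode map is implicitly 0F.
        return 2, "0F", len(code)
    if code[0] == 0xC4:
        # 3-byte VEX: the low 5 bits of the second byte select the opcode map.
        maps = {1: "0F", 2: "0F38", 3: "0F3A"}
        return 3, maps[code[1] & 0x1F], len(code)
    raise ValueError("not a VEX prefix")


# Byte sequences copied from the disassembly above.
vshufps = bytes.fromhex("c5e8c6ca00")
vpermilps = bytes.fromhex("c4e37904ca00")

for name, code in (("vshufps", vshufps), ("vpermilps", vpermilps)):
    prefix_len, opcode_map, total = vex_info(code)
    print(f"{name}: {prefix_len}-byte VEX, map {opcode_map}, {total} bytes total")
```

Both instructions share the same ModRM byte (ca) and immediate (00), so the extra byte comes entirely from the longer prefix.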

> +%macro BLENDVPS 3 ; dst/src_a, src_b, mask
> +%if cpuflag(avx)
> +    blendvps  %1, %1, %2, %3
> +%elif cpuflag(sse4)
> +    %if notcpuflag(avx)
> +        %ifnidn %3,xmm0
> +            %error sse41 blendvps uses xmm0 as default 3d operand, you used %3
> +        %endif
> +    %endif

The notcpuflag(avx) check is redundant: it is always true at that point,
since AVX takes the first branch.
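
The redundancy can be checked by enumerating the flag combinations. The sketch below is only a Python model of the %if/%elif chain, not x86inc itself; `branch_taken` is a hypothetical stand-in for the preprocessor's branch selection, and the two flags are treated as independent booleans for simplicity.

```python
from itertools import product


def branch_taken(avx: bool, sse4: bool):
    """Model of the macro's %if cpuflag(avx) / %elif cpuflag(sse4) chain."""
    if avx:
        return "avx"
    elif sse4:
        return "sse4"
    return None


# Collect the value of notcpuflag(avx) in every case that reaches the sse4 branch.
inner_check = {
    not avx
    for avx, sse4 in product([False, True], repeat=2)
    if branch_taken(avx, sse4) == "sse4"
}
print(inner_check)
```

The set contains a single value, True, so the nested test never changes which code is assembled.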

