[FFmpeg-devel] [PATCH v4 1/2][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

Ronald S. Bultje rsbultje at gmail.com
Mon May 20 18:52:56 EEST 2024


Hi,

one more comment that I forgot.

On Sun, May 19, 2024 at 8:46 PM Stone Chen <chen.stonechen at gmail.com> wrote:

> +pw_1: dw 1
>
[..]

> +    vpbroadcastw       m4, [pw_1]
>

We typically suggest using vpbroadcastd rather than vpbroadcastw (with the
constant changed to pw_1: times 2 dw 1, so that it fills a whole dword).
Agner Fog's instruction tables show that on e.g. Haswell, the former (d) is
1 uop with 5 cycles latency, whereas the latter (w) is 3 uops with 7 cycles
latency; more generally, d is faster than w.
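
To illustrate why the constant has to grow to two words, here is a rough
plain-C model of what the two broadcasts write into a 256-bit register. It
is only a sketch and not part of the patch: broadcast_w, broadcast_d and
broadcasts_match are hypothetical names made up for this example, and the
pw_1 array stands in for "pw_1: times 2 dw 1".

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for "pw_1: times 2 dw 1": two 16-bit ones, 4 bytes in total. */
static const uint16_t pw_1[2] = { 1, 1 };

/* Models vpbroadcastw ymm, [mem]: one 16-bit word copied to 16 lanes. */
static void broadcast_w(uint16_t dst[16], const uint16_t *src)
{
    for (int i = 0; i < 16; i++)
        dst[i] = src[0];
}

/* Models vpbroadcastd ymm, [mem]: one 32-bit dword copied to 8 lanes.
 * Note that it reads 4 bytes from src, not 2. */
static void broadcast_d(uint16_t dst[16], const uint16_t *src)
{
    uint32_t dword;
    memcpy(&dword, src, sizeof(dword));
    for (int i = 0; i < 8; i++)
        memcpy(&dst[2 * i], &dword, sizeof(dword));
}

/* Returns 1 if both broadcasts give the same 256-bit register image. */
static int broadcasts_match(void)
{
    uint16_t w[16], d[16];
    broadcast_w(w, pw_1);
    broadcast_d(d, pw_1);
    return memcmp(w, d, sizeof(w)) == 0;
}
```

Since the dword load reads 4 bytes, a lone "dw 1" would let it pick up
whatever follows the constant in the data section; with two words of 1 the
register contents are the same as with the word broadcast.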

Ronald

