[FFmpeg-devel] [PATCH] lavc/vvc_mc: reduce sequential dependency in R-V V sad
flow gg
hlefthleft at gmail.com
Sat Dec 21 14:22:40 EET 2024
> Don't clobber v8 here.
> Use vsub.vv here to avoid the sequential dependency.
Updated.
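
For reference, here is a rough scalar C model of what changes per element. It is only a sketch: the function names sad_chain and sad_indep are hypothetical and do not appear in the patch, and it assumes every difference fits in a signed 16-bit value. In the old form the negate has to wait for the subtract. In the new form the two subtracts depend only on the loads, so they can issue back to back.

```c
#include <stddef.h>
#include <stdint.h>

/* Old sequence: vsub.vv -> vneg.v -> vmax.vv, each op waits for the previous one. */
static uint32_t sad_chain(const int16_t *a, const int16_t *b, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        int16_t d   = (int16_t)(a[i] - b[i]);  /* vsub.vv v8, v8, v16 */
        int16_t neg = (int16_t)-d;             /* vneg.v  v16, v8 (needs d) */
        int16_t m   = d > neg ? d : neg;       /* vmax.vv v8, v8, v16 */
        sum += (uint16_t)m;                    /* vwaddu.wv v0, v0, v8 */
    }
    return sum;
}

/* New sequence: both subtracts read only the loaded values. */
static uint32_t sad_indep(const int16_t *a, const int16_t *b, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        int16_t d0 = (int16_t)(a[i] - b[i]);   /* vsub.vv v24, v8, v16 */
        int16_t d1 = (int16_t)(b[i] - a[i]);   /* vsub.vv v16, v16, v8 */
        int16_t m  = d0 > d1 ? d0 : d1;        /* vmax.vv v8, v24, v16 */
        sum += (uint16_t)m;                    /* vwaddu.wv v0, v0, v8 */
    }
    return sum;
}
```

Both functions return the same sum, since max(a - b, b - a) equals |a - b|. Because v24 now holds a temporary inside the loop, the patch zeroes it only after the loop, right before vredsum.vs.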
<uk7b-at-foxmail.com at ffmpeg.org> wrote on Sat, Dec 21, 2024 at 20:22:
> From: sunyuechi <sunyuechi at iscas.ac.cn>
>
> ---
> libavcodec/riscv/vvc/vvc_sad_rvv.S | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/libavcodec/riscv/vvc/vvc_sad_rvv.S b/libavcodec/riscv/vvc/vvc_sad_rvv.S
> index 341167be1f..f325deee17 100644
> --- a/libavcodec/riscv/vvc/vvc_sad_rvv.S
> +++ b/libavcodec/riscv/vvc/vvc_sad_rvv.S
> @@ -36,20 +36,20 @@ func ff_vvc_sad_rvv_\vlen, zve32x, zbb, zba
> SADVSET\vlen\w:
> vsetvlstatic32 \w, \vlen
> vmv.v.i v0, 0
> - vmv.s.x v24, zero
> vsetvlstatic16 \w, \vlen
> SAD\vlen\w:
> addi a5, a5, -2
> vle16.v v8, (a0)
> vle16.v v16, (a1)
> - vsub.vv v8, v8, v16
> - vneg.v v16, v8
> + vsub.vv v24, v8, v16
> + vsub.vv v16, v16, v8
> addi a0, a0, 2 * 128 * 2
> - vmax.vv v8, v8, v16
> - vwaddu.wv v0, v0, v8
> + vmax.vv v8, v24, v16
> addi a1, a1, 2 * 128 * 2
> + vwaddu.wv v0, v0, v8
> bnez a5, SAD\vlen\w
> vsetvlstatic32 \w, \vlen
> + vmv.s.x v24, zero
> vredsum.vs v24, v0, v24
> vmv.x.s a0, v24
> ret
> --
> 2.47.1