[FFmpeg-devel] [PATCH 6/6] lavc/takdsp: R-V V decorrelate_sm
Rémi Denis-Courmont
remi at remlab.net
Thu Dec 21 18:07:55 EET 2023
On Monday, 18 December 2023, at 17:16:27 EET, flow gg wrote:
> C908:
> decorrelate_sm_c: 130.0
> decorrelate_sm_rvv_i32: 43.7
+
+func ff_decorrelate_sm_rvv, zve32x
+1:
+ vsetvli t0, a2, e32, m8, ta, ma
+ vle32.v v0, (a0)
+ sub a2, a2, t0
+ vle32.v v8, (a1)
+ vsra.vi v16, v8, 1
You should load v8 first, since it is used as an input (by VSRA) before v0 is.
+ vsub.vv v0, v0, v16
+ vse32.v v0, (a0)
+ sh2add a0, t0, a0
+ vadd.vv v0, v0, v8
You can use VSSRA (the rounding shift) on v8, and then VADD won't need to depend on the output of VSUB.
+ vse32.v v0, (a1)
+ sh2add a1, t0, a1
+ bnez a2, 1b
+ ret
+endfunc
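
To illustrate the VSSRA remark, here is a rough scalar sketch in C, not the actual patch. It assumes vxrm is set to round-to-nearest-up (rnu), in which case a VSSRA by 1 yields (b >> 1) + (b & 1), which equals b - (b >> 1). The function names are hypothetical, and the code assumes that >> on a negative int32_t is an arithmetic shift, as on the usual compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of the quoted loop: the second output is computed
 * from the first one, so the add has to wait for the subtract. */
static void decorrelate_sm_ref(int32_t *p1, int32_t *p2, int length)
{
    for (int i = 0; i < length; i++) {
        int32_t a = p1[i];
        int32_t b = p2[i];

        a -= b >> 1;       /* vsra.vi + vsub.vv */
        p1[i] = a;
        p2[i] = a + b;     /* vadd.vv on the vsub.vv result */
    }
}

/* Hypothetical variant: both outputs are computed from the original
 * inputs, so the subtract and the add are independent of each other. */
static void decorrelate_sm_rounded(int32_t *p1, int32_t *p2, int length)
{
    for (int i = 0; i < length; i++) {
        int32_t a = p1[i];
        int32_t b = p2[i];
        int32_t half_down = b >> 1;             /* models vsra.vi by 1 */
        int32_t half_up = (b >> 1) + (b & 1);   /* models vssra.vi by 1, vxrm = rnu (assumed) */

        p1[i] = a - half_down;
        p2[i] = a + half_up;
    }
}

/* Checks that both variants agree on a few small sample values. */
static void check_variants(void)
{
    int32_t a1[] = { 10, -5, 0, 100, -7 };
    int32_t b1[] = { 7, -3, 0, -8, 1 };
    int32_t a2[] = { 10, -5, 0, 100, -7 };
    int32_t b2[] = { 7, -3, 0, -8, 1 };
    const int n = sizeof(a1) / sizeof(a1[0]);

    decorrelate_sm_ref(a1, b1, n);
    decorrelate_sm_rounded(a2, b2, n);
    for (int i = 0; i < n; i++) {
        assert(a1[i] == a2[i]);
        assert(b1[i] == b2[i]);
    }
}
```

Calling check_variants() exercises both forms on the same inputs. In the vector code, the second form would need VSUB to write to a register other than v0, so that VADD can still read the original v0.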
--
雷米‧德尼-库尔蒙
http://www.remlab.net/