[FFmpeg-devel] [PATCH 0/5] RISC-V: Improve H264 decoding performance using RVV intrinsic
Lynne
dev at lynne.ee
Tue May 9 18:47:55 EEST 2023
May 9, 2023, 11:51 by arnie.chang at sifive.com:
> We are submitting a set of patches that significantly improve H.264 decoding performance
> by utilizing RVV intrinsic code. The average speedup (FPS) achieved by these patches is more than 2x,
> as experimented on 720P videos running on an internal FPGA board.
>
> Patch1: add support for RVV intrinsic code in the configure file
> Patch2: optimize chroma motion compensation
> Patch3: optimize luma motion compensation
> Patch4: optimize dsp functions, such as IDCT, in-loop filtering, and weighted filtering
> Patch5: optimize intra prediction
>
> Arnie Chang (5):
> configure: Add detection of RISC-V vector intrinsic support
> lavc/h264chroma: Add vectorized implementation of chroma MC for RISC-V
> lavc/h264qpel: Add vectorized implementation of luma MC for RISC-V
> lavc/h264dsp: Add vectorized implementation of DSP functions for
> RISC-V
> lavc/h264pred: Add vectorized implementation of intra prediction for
> RISC-V
>
Could you rewrite this in asm instead? I'd like RISC-V to have the same
policy as we have for Arm: no intrinsics. There's a long list of reasons we
don't use intrinsics, which I won't get into.
Just a few days ago, I discovered that our PPC intrinsics performed quite
badly due to compiler issues, in some cases 500x slower than C.
Also, we don't care about overall speedup. We have checkasm --bench
to measure the per-function speedup over C.