[FFmpeg-devel] [PATCH 2/2] lavc/vc1dsp: R-V V mspel_pixels
flow gg
hlefthleft at gmail.com
Fri Mar 8 02:45:46 EET 2024
> Isn't it also faster to max LMUL for the adds here?
It requires the use of one more vset, making the time slightly longer:
147.7 (m1), 148.7 (m8 + vset).
> Also this might not be much noticeable on C908, but avoiding sequential
> dependencies on the address registers may help. I mean, avoid using as
> address operand a value that was calculated by the immediate previous
> instruction.
Okay, but the test results haven't changed.
It would add more than ten lines of code; perhaps shorter code would be better?
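For reference, the address-register suggestion can be sketched roughly as
below. This is only an illustration, not code from the patch: the register
assignment (a1 = source pointer, a2 = stride, t0-t2 as scratch, v8-v11 as
destinations) and the 8-byte row width are assumptions.

```asm
        # Assumed: a1 = src, a2 = stride, rows are 8 bytes wide (VLEN >= 128).
        vsetivli zero, 8, e8, mf2, ta, ma

        # Dependent form: each load uses the address produced by the
        # instruction immediately before it.
        #   vle8.v  v8, (a1)
        #   add     a1, a1, a2
        #   vle8.v  v9, (a1)
        #   add     a1, a1, a2
        #   vle8.v  v10, (a1)

        # Interleaved form: every address is computed at least two
        # instructions before the load that consumes it.
        add     t0, a1, a2        # t0 = src + 1 * stride
        add     t1, t0, a2        # t1 = src + 2 * stride
        vle8.v  v8, (a1)          # row 0
        add     t2, t1, a2        # t2 = src + 3 * stride
        vle8.v  v9, (t0)          # row 1
        add     a1, t2, a2        # advance src to row 4 for the next group
        vle8.v  v10, (t1)         # row 2
        vle8.v  v11, (t2)         # row 3
```

The interleaved form costs extra scratch registers and lines, which is the
code-size trade-off mentioned above; whether it pays off depends on the core.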
On Fri, Mar 8, 2024 at 02:55, Rémi Denis-Courmont <remi at remlab.net> wrote:
> On Saturday, 2 March 2024, 14:06:13 EET, flow gg wrote:
> > Here adjusting the order, rather than simply using .rept, will be 13%-24%
> > faster.
>
> Isn't it also faster to max LMUL for the adds here?
>
> Also this might not be much noticeable on C908, but avoiding sequential
> dependencies on the address registers may help. I mean, avoid using as
> address operand a value that was calculated by the immediate previous
> instruction.
>
> --
> Rémi Denis-Courmont
> http://www.remlab.net/