[FFmpeg-devel] [PATCH 2/2] lavc/flacdsp: implement wasted32 DSP function for VSX on POWER
Sean McGovern
gseanmcg at gmail.com
Sat Jul 6 23:00:47 EEST 2024
Hi,
On Thu, Jul 4, 2024, 13:54 Rémi Denis-Courmont <remi at remlab.net> wrote:
> On Thursday, 4 July 2024, 19:26:19 EEST, Sean McGovern wrote:
> > Is that correlated with the comment above re: len? Or is it more general
> > that I should unroll until I've exhausted the available vector registers?
>
> You should unroll if it improves bandwidth.
>
> --
> レミ・デニ-クールモン
> http://www.remlab.net/
After adding a second set of load/left-shift/store, further unrolling gave
diminishing or no returns. I'll send the updated version later.
Does wasted32 (and, I guess, wasted33 by proxy) not have to worry about loop
tails? I noticed the other vectorized versions don't do anything special in
that regard.
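To make the loop-tail question concrete, here is a rough scalar sketch of the
shape I mean: a main loop that handles two groups of four 32-bit samples per
iteration (standing in for two 128-bit load/left-shift/store sets), followed
by a scalar tail for whatever is left over. The function name and signature
are hypothetical and this is not the VSX code from the patch; whether callers
already guarantee a length that makes the tail unnecessary is exactly what I
am unsure about.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for a 2x-unrolled vector loop.
 * Each group of 4 samples models one 128-bit vector of int32. */
static void wasted32_unrolled_sketch(int32_t *decoded, int wasted, int len)
{
    int i = 0;

    /* Assumed precondition: shift count is valid for 32-bit samples. */
    assert(wasted >= 0 && wasted < 32);

    /* Main loop: 8 samples per iteration (two "vectors" of 4). */
    for (; i + 8 <= len; i += 8) {
        for (int j = 0; j < 4; j++)      /* first load/shift/store set */
            decoded[i + j] = (int32_t)((uint32_t)decoded[i + j] << wasted);
        for (int j = 4; j < 8; j++)      /* second load/shift/store set */
            decoded[i + j] = (int32_t)((uint32_t)decoded[i + j] << wasted);
    }

    /* Tail: the remaining len % 8 samples, one at a time. */
    for (; i < len; i++)
        decoded[i] = (int32_t)((uint32_t)decoded[i] << wasted);
}
```

With len = 11, the main loop covers samples 0..7 and the tail covers 8..10.
If the tail were dropped, those last three samples would be left unshifted.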
-- Sean McGovern