[FFmpeg-devel] [RFC] flac_wasted32 vector implementation for VSX on ppc64le
Sean McGovern
gseanmcg at gmail.com
Thu Jun 6 10:43:05 EEST 2024
Hi,
Attached inline is a _non-working_ implementation of flac_wasted32 for
VSX, developed on a POWER9 in little-endian mode but probably just as
usable on POWER{8,10}.
I'm not sure why what is probably one of the simplest DSP functions in
lavc does not work for me. I imagine this is something endian-related,
even though IBM's documentation for vec_sl()[1] does not suggest any
endian-specific behaviour.
Here's my code:
#define VSX_STRIDE 16

void ff_flac_wasted32_vsx(int32_t *decoded, int wasted, int len)
{
    register vec_s32 vec1;
    register vec_u32 vec2 = { wasted, wasted, wasted, wasted };
    register vec_s32 shifted;

    for (int i = 0; i < len; i += VSX_STRIDE) {
        vec1 = vec_vsx_ld(i, decoded);
        shifted = vec_sl(vec1, vec2);
        vec_vsx_st(shifted, i, decoded);
    }
}
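
For comparison, the scalar behaviour the vector code is meant to reproduce can be sketched as below. This is a minimal sketch, assuming the function is simply a per-sample left shift by `wasted` over `len` 32-bit samples; `wasted32_ref` and `first_mismatch` are hypothetical helper names used only for illustration and are not part of lavc.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of lavc: scalar reference for the shift. */
static void wasted32_ref(int32_t *decoded, int wasted, int len)
{
    /* A shift count outside 0..31 is undefined for a 32-bit value. */
    assert(wasted >= 0 && wasted < 32);

    for (int i = 0; i < len; i++) {
        /* Shift as unsigned so that negative samples do not overflow. */
        decoded[i] = (int32_t)((uint32_t)decoded[i] << wasted);
    }
}

/*
 * Hypothetical helper: return the index of the first sample where the
 * output under test differs from the reference, or -1 if they match.
 */
static int first_mismatch(const int32_t *got, const int32_t *ref, int len)
{
    for (int i = 0; i < len; i++) {
        if (got[i] != ref[i])
            return i;
    }
    return -1;
}
```

Running the vector version and the reference on copies of the same buffer and calling `first_mismatch()` shows how far the vector loop got before its output diverged. That index may be informative here, since vec_vsx_ld() and vec_vsx_st() take their offset argument in bytes while `len` counts 32-bit samples.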
Anyone with experience with AltiVec or VSX see something obvious I am missing?
-- Sean McGovern
[1] https://www.ibm.com/docs/en/xl-c-and-cpp-linux/16.1.1?topic=functions-vec-sl