[FFmpeg-devel] [RFC] flac_wasted32 vector implementation for VSX on ppc64le

Sean McGovern gseanmcg at gmail.com
Thu Jun 6 10:43:05 EEST 2024


Hi,

Attached inline is a _non-working_ implementation of flac_wasted32 for
VSX, developed on a POWER9 in little-endian mode but probably just as
usable on POWER{8,10}.

I'm not sure why what is probably one of the simplest DSP functions in
lavc does not work for me. I imagine this is something endian-related,
even though IBM's documentation for vec_sl()[1] does not suggest any
endian dependence.

Here's my code:

#define VSX_STRIDE 16

void ff_flac_wasted32_vsx(int32_t *decoded, int wasted, int len)
{
   register vec_s32 vec1;
   register vec_u32 vec2 = { wasted, wasted, wasted, wasted };
   register vec_s32 shifted;

   for (int i = 0; i < len; i += VSX_STRIDE) {
       vec1 = vec_vsx_ld(i, decoded);
       shifted = vec_sl(vec1, vec2);
       vec_vsx_st(shifted, i, decoded);
   }
}
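
For reference, the scalar behaviour I am trying to match is, as I
understand it, just a left shift of every sample by `wasted` bits. A
minimal sketch follows. The name wasted32_scalar_ref is made up for this
mail and is not the lavc symbol, and the cast through uint32_t is my own
assumption for keeping the shift of negative samples well defined:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar reference, not the actual lavc function.
 * Shifts each of the `len` samples left by `wasted` bits in place. */
static void wasted32_scalar_ref(int32_t *decoded, int wasted, int len)
{
    /* Shifting a 32-bit value by 32 or more bits is undefined in C. */
    assert(wasted >= 0 && wasted < 32);

    for (int i = 0; i < len; i++) {
        /* Shift as unsigned so that negative samples do not trigger
         * undefined behaviour, then convert back to signed. */
        decoded[i] = (int32_t)((uint32_t)decoded[i] << wasted);
    }
}
```

Note that `i` here counts int32_t samples, not bytes, so any vector
version has to produce the same result for all `len` samples.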

Does anyone with experience with AltiVec or VSX see something obvious I am missing?

-- Sean McGovern

[1] https://www.ibm.com/docs/en/xl-c-and-cpp-linux/16.1.1?topic=functions-vec-sl

