[FFmpeg-devel] [PATCH v2] swscale/output: Altivec-optimize yuv2plane1_8

Carl Eugen Hoyos ceffmpeg at gmail.com
Tue Nov 27 01:17:12 EET 2018


2018-11-17 9:12 GMT+01:00, Lauri Kasanen <cand at gmx.com>:
> ./ffmpeg_g -f rawvideo -pix_fmt rgb24 -s hd1080 -i /dev/zero -pix_fmt yuv420p \
> -f null -vframes 100 -v error -nostats -
>
> 1158 UNITS in planar1,   65528 runs,      8 skips
>
> -cpuflags 0
>
> 19082 UNITS in planar1,   65533 runs,      3 skips
>
> 16.48 speedup ratio. On x86, SSE2 is ~7. Curiously, the Power C version
> takes as many cycles as the x86 SSE2 version, yikes it's fast.
>
> Note that this function uses VSX instructions, but is not marked so.
> This is because several existing functions also make that mistake.
> I'll submit a patch moving them once this is reviewed.
>
> v2: Remove !BE check
> Signed-off-by: Lauri Kasanen <cand at gmx.com>
> ---
>  libswscale/ppc/swscale_altivec.c | 53 ++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 53 insertions(+)
>
> diff --git a/libswscale/ppc/swscale_altivec.c b/libswscale/ppc/swscale_altivec.c
> index 2fb2337..8c6056d 100644
> --- a/libswscale/ppc/swscale_altivec.c
> +++ b/libswscale/ppc/swscale_altivec.c
> @@ -324,6 +324,53 @@ static void hScale_altivec_real(SwsContext *c, int16_t *dst, int dstW,
>              }
>          }
>  }
> +
> +static void yuv2plane1_8_u(const int16_t *src, uint8_t *dest, int dstW,
> +                           const uint8_t *dither, int offset, int start)
> +{
> +    int i;
> +    for (i = start; i < dstW; i++) {
> +        int val = (src[i] + dither[(i + offset) & 7]) >> 7;
> +        dest[i] = av_clip_uint8(val);
> +    }
> +}
> +
> +static void yuv2plane1_8_altivec(const int16_t *src, uint8_t *dest, int dstW,
> +                           const uint8_t *dither, int offset)
> +{
> +    const int dst_u = -(uintptr_t)dest & 15;
> +    int i, j;
> +    LOCAL_ALIGNED(16, int16_t, val, [16]);

> +    const vector uint16_t shifts = (vector uint16_t) {7, 7, 7, 7, 7, 7, 7, 7};

The patch breaks compilation with xlc, sorry for not testing earlier:
libswscale/ppc/swscale_altivec.c:344:11: error: unknown type name 'vector'
    const vector uint16_t shifts = (vector uint16_t) {7, 7, 7, 7, 7, 7, 7, 7};
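
A possible workaround, as an untested sketch: assuming xlc only treats
"vector" as a keyword when a built-in type keyword follows it, spelling the
element type as "unsigned short" (which is what FFmpeg's vec_u16 macro
expands to) instead of the uint16_t typedef should avoid the error. The
names ending in _sketch and _ref below are hypothetical. The scalar function
mirrors yuv2plane1_8_u from the patch, with a local clip standing in for
av_clip_uint8, and could serve as a reference when checking a vector version.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__ALTIVEC__)
#include <altivec.h>
/* Assumption: the element type is written with built-in keywords so that
 * compilers which reject "vector <typedef-name>" still accept it. */
static inline vector unsigned short make_shifts_sketch(void)
{
    return (vector unsigned short) {7, 7, 7, 7, 7, 7, 7, 7};
}
#endif

/* Local stand-in for av_clip_uint8(): clamp an int to 0..255. */
static inline uint8_t clip_uint8_sketch(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Portable scalar reference: add the 8-entry dither pattern, drop the
 * 7 fractional bits, clamp to 8 bits. Relies on an arithmetic right
 * shift for negative values, as gcc and clang provide. */
static inline void yuv2plane1_8_ref(const int16_t *src, uint8_t *dest,
                                    int dstW, const uint8_t *dither,
                                    int offset)
{
    for (int i = 0; i < dstW; i++) {
        int val = (src[i] + dither[(i + offset) & 7]) >> 7;
        dest[i] = clip_uint8_sketch(val);
    }
}
```

Comparing the output of the vector function against yuv2plane1_8_ref for a
few widths and destination alignments would exercise both the unaligned
head and the tail handling.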

Carl Eugen

