[FFmpeg-devel] [PATCH v2] swscale/output: Altivec-optimize yuv2plane1_8

Lauri Kasanen cand at gmx.com
Wed Nov 21 14:35:32 EET 2018


On Wed, 21 Nov 2018 13:21:58 +0100
Michael Niedermayer <michael at niedermayer.cc> wrote:

> On Wed, Nov 21, 2018 at 10:12:48AM +0200, Lauri Kasanen wrote:
> > > ./ffmpeg_g -f rawvideo -pix_fmt rgb24 -s hd1080 -i /dev/zero -pix_fmt yuv420p \
> > > -f null -vframes 100 -v error -nostats -
> > > 
> > > 1158 UNITS in planar1,   65528 runs,      8 skips
> > > 
> > > -cpuflags 0
> > > 
> > > 19082 UNITS in planar1,   65533 runs,      3 skips
> > > 
> > > 16.48 speedup ratio. On x86, SSE2 is ~7. Curiously, the Power C version
> > > takes as many cycles as the x86 SSE2 version, yikes it's fast.
> > > 
> > > Note that this function uses VSX instructions, but is not marked so.
> > > This is because several existing functions also make that mistake.
> > > I'll submit a patch moving them once this is reviewed.
> > > 
> > > v2: Remove !BE check
> > > Signed-off-by: Lauri Kasanen <cand at gmx.com>
> > 
> > Ping. Seems not many ffmpeg devs interested in ppc.
> 
> Have you tried "make fate" with this patch? (Note that you need to configure
> with fate samples so all tests are run.)

I ran the fate tests with "scale" in the name; I gather the full suite
takes more than 20 minutes. Beyond that, I tested a PNG-to-video
conversion on LE, and Carl Eugen Hoyos tested with Lena on BE.
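
For readers following the testing discussion, here is a minimal sketch of
the kind of bit-exact comparison these tests rely on. It assumes the scalar
routine adds an 8-entry dither value, shifts right by 7 and clamps to 8 bits,
which is my reading of the C reference, not a copy of it. Every name ending
in _sketch is hypothetical, and the "blocked" variant is only a plain-C
stand-in for a vector path, not the Altivec code from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Clamp an int to 0..255 (hypothetical stand-in for av_clip_uint8). */
static uint8_t clip_uint8_sketch(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Assumed scalar behaviour: add dither, drop 7 fractional bits, clamp.
 * Relies on arithmetic right shift of negative ints, as common compilers do. */
static void plane1_8_scalar_sketch(const int16_t *src, uint8_t *dest, int dstW,
                                   const uint8_t *dither, int offset)
{
    assert(dstW >= 0);
    for (int i = 0; i < dstW; i++) {
        int val = (src[i] + dither[(i + offset) & 7]) >> 7;
        dest[i] = clip_uint8_sketch(val);
    }
}

/* Same arithmetic, split into 16-pixel blocks plus a scalar tail, the way
 * a SIMD implementation would typically partition a row. */
static void plane1_8_blocked_sketch(const int16_t *src, uint8_t *dest, int dstW,
                                    const uint8_t *dither, int offset)
{
    int i = 0;

    assert(dstW >= 0);
    for (; i + 16 <= dstW; i += 16) {
        for (int j = 0; j < 16; j++) {
            int val = (src[i + j] + dither[(i + j + offset) & 7]) >> 7;
            dest[i + j] = clip_uint8_sketch(val);
        }
    }
    for (; i < dstW; i++) {              /* leftover pixels past the last block */
        int val = (src[i] + dither[(i + offset) & 7]) >> 7;
        dest[i] = clip_uint8_sketch(val);
    }
}

/* Returns 1 when two output rows are byte-for-byte identical. */
static int rows_identical_sketch(const uint8_t *a, const uint8_t *b, int n)
{
    return memcmp(a, b, (size_t)n) == 0;
}
```

Feeding both functions the same row, including a width that is not a
multiple of 16, and checking rows_identical_sketch() mirrors what a
reference-versus-optimized comparison has to establish: identical bytes,
not merely similar ones.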

- Lauri

