[FFmpeg-devel] [PATCH] PPC64: Add versions of functions in libswscale/input.c optimized for POWER8 VSX SIMD.

Dominik 'Rathann' Mierzejewski dominik at greysector.net
Thu Jul 7 18:55:01 EEST 2016


On Thursday, 07 July 2016 at 14:51, Ronald S. Bultje wrote:
[...]
> - start with one function. Take a really simple one. Don't do 20 at a time.
> Especially if this is your first time writing ppc64 assembly.
> - measure speedups on other archs with similar register width. Best
> example: measure SSE2 vs. C.
> - make sure you're measuring scalar C when measuring the base speed, since
> x86 C vs. SSE2 is also scalar C vs. vector SIMD. There might be other
> functions being picked up that we don't know about (some altivec is
> BE-aware; your compiler might be auto-vectorizing C code).
> - optimize your one function. Start with ideas taken from the x86 SSE2
> code. Use all things learned from x86 basics (do aligned loads where
> possible, limit shuffles/data rearrangements, load constants outside loop,
> etc.).
> - measure. Use START/STOP_TIMER, nothing else, around the caller
> with/without -cpuflags 0 and look only at the last reported cycle count
> line.
> - make changes. Measure again. Repeat. Do this with all suggestions from
> code review also. Your test should be ultra-fast, something that takes 10
> seconds but invokes the function millions of times. If unsure, write a test
> in checkasm, but usually one invocation from a fate test is good enough.
> - if this is your first time writing assembly, you'll get tons of review
> comments. This is normal, and we've all been through it. You'll become a
> better coder for it, so learn from it, deal with it and keep submitting
> patches until it's done. A few years from now, you'll be the expert
> reviewer and an even newer contributor will not yet know that he's about to
> learn some extremely important lessons from an experienced expert - you.
> - once your first few individual functions are in, it may make sense to
> submit sets of functions that are somehow related. However, this increases
> review load so only do this once we know that you know what you're doing.

I think the above is very well written and could actually be used as
a guide for new contributors. Thanks, Ronald.
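
As a rough illustration of the "one function, scalar C baseline, measure,
repeat" loop from the quoted list, here is a minimal self-contained sketch.
It is not FFmpeg code. The function names (rgb24_to_y_ref, rgb24_to_y_opt,
outputs_match, time_fn, print_timings) and the luma weights are hypothetical,
the "optimized" version is only a 4x unrolled loop standing in for real SIMD,
and clock() is a portable stand-in for START_TIMER/STOP_TIMER from
libavutil/timer.h. When taking the baseline, it is worth checking that the
compiler has not auto-vectorized the reference (with GCC, for example, by
building it with -fno-tree-vectorize).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Illustrative integer luma weights (77 + 150 + 29 = 256); an assumption,
 * not the coefficients libswscale uses. */
static inline uint8_t luma(const uint8_t *p)
{
    return (uint8_t)((77 * p[0] + 150 * p[1] + 29 * p[2] + 128) >> 8);
}

/* Scalar reference: packed RGB24 -> 8-bit luma, one pixel per iteration. */
void rgb24_to_y_ref(uint8_t *dst, const uint8_t *src, size_t width)
{
    for (size_t i = 0; i < width; i++)
        dst[i] = luma(src + 3 * i);
}

/* Stand-in for the optimized version: four pixels per iteration plus a
 * scalar tail. A real patch would use vector loads and multiplies here. */
void rgb24_to_y_opt(uint8_t *dst, const uint8_t *src, size_t width)
{
    size_t i = 0;
    for (; i + 4 <= width; i += 4) {
        const uint8_t *p = src + 3 * i;
        dst[i + 0] = luma(p + 0);
        dst[i + 1] = luma(p + 3);
        dst[i + 2] = luma(p + 6);
        dst[i + 3] = luma(p + 9);
    }
    for (; i < width; i++)
        dst[i] = luma(src + 3 * i);
}

/* Correctness first: both versions must agree byte for byte. */
int outputs_match(size_t width)
{
    enum { MAX_WIDTH = 1024 };
    uint8_t src[3 * MAX_WIDTH], ref[MAX_WIDTH], opt[MAX_WIDTH];

    assert(width <= MAX_WIDTH);
    for (size_t i = 0; i < 3 * width; i++)
        src[i] = (uint8_t)(i * 37 + 11); /* deterministic test pattern */
    rgb24_to_y_ref(ref, src, width);
    rgb24_to_y_opt(opt, src, width);
    return memcmp(ref, opt, width) == 0;
}

typedef void (*convert_fn)(uint8_t *, const uint8_t *, size_t);

/* Time many invocations so that a short run still gives a stable number. */
double time_fn(convert_fn fn, uint8_t *dst, const uint8_t *src,
               size_t width, int iterations)
{
    clock_t start = clock();
    for (int n = 0; n < iterations; n++)
        fn(dst, src, width);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Report CPU time for the reference and the candidate on the same input. */
void print_timings(void)
{
    enum { WIDTH = 1024, ITERATIONS = 100000 };
    static uint8_t src[3 * WIDTH], dst[WIDTH];

    for (size_t i = 0; i < sizeof(src); i++)
        src[i] = (uint8_t)(i * 37 + 11);
    printf("ref: %.3f s\n", time_fn(rgb24_to_y_ref, dst, src, WIDTH, ITERATIONS));
    printf("opt: %.3f s\n", time_fn(rgb24_to_y_opt, dst, src, WIDTH, ITERATIONS));
}
```

Calling outputs_match() for a few widths (including ones that are not a
multiple of four) and then print_timings() from main() gives the
change/measure/repeat cycle in miniature; in the real tree, checkasm and
START/STOP_TIMER play these roles.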

Regards,
Dominik
-- 
MPlayer http://mplayerhq.hu | RPM Fusion http://rpmfusion.org
There should be a science of discontent. People need hard times and
oppression to develop psychic muscles.
	-- from "Collected Sayings of Muad'Dib" by the Princess Irulan

