[FFmpeg-devel] [PATCH] PPC64: Add versions of functions in libswscale/input.c optimized for POWER8 VSX SIMD.

Carl Eugen Hoyos cehoyos at ag.or.at
Mon Jul 4 19:30:00 EEST 2016


Dan Parrot <dan.parrot <at> mail.com> writes:

> > Did you test if using ffmpeg -benchmark -f rawvideo -i /dev/zero... 
> > showed different results?
> > I believe this should be both easier and faster to test.
>
> Sorry, I don't understand what that command line just above 
> is trying to achieve. Could you elaborate?

Instead of running the whole fate suite, which takes a long 
time and does not exercise libswscale in most of its tests, 
just run an ffmpeg command line that only tests libswscale:
$ ffmpeg -benchmark -f rawvideo -pix_fmt rgb24 \
    -i /dev/zero -pix_fmt yuv420p -f null -vframes 10000 -

vs

$ ffmpeg -cpuflags 0 -benchmark -f rawvideo -pix_fmt rgb24 \
    -i /dev/zero -pix_fmt yuv420p -f null -vframes 10000 -
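
To compare the two runs above back to back, something like the 
following might help. It is only a sketch: the function names 
swscale_bench_cmd and run_swscale_bench are made up for this 
example and are not part of FFmpeg, and the grep assumes that 
the -benchmark summary lines start with "bench:".

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of FFmpeg.

# Print the libswscale-only benchmark command line.
#   $1 = value passed to -cpuflags ("" keeps normal CPU detection)
#   $2 = number of frames to convert (defaults to 10000, as above)
# Note: if the rawvideo demuxer asks for a frame size, an input option
# for it would have to be added; the original commands do not set one.
swscale_bench_cmd() {
    cpuflags=$1
    frames=${2:-10000}
    cmd="ffmpeg"
    if [ -n "$cpuflags" ]; then
        cmd="$cmd -cpuflags $cpuflags"
    fi
    printf '%s\n' "$cmd -benchmark -f rawvideo -pix_fmt rgb24 -i /dev/zero -pix_fmt yuv420p -f null -vframes $frames -"
}

# Run one variant and keep only the benchmark summary lines.
# Assumption: those lines start with "bench:"; stderr is merged into
# stdout so the filter sees them whichever stream they are printed on.
run_swscale_bench() {
    $(swscale_bench_cmd "$@") 2>&1 | grep 'bench:'
}
```

Calling run_swscale_bench "" and then run_swscale_bench 0 
would run the optimized and the -cpuflags 0 variants in turn, 
so the two sets of "bench:" lines can be compared directly.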

[...]

> Surprisingly, gcc is producing some badly suboptimal assembly.

Just to make sure I don't misunderstand:
Does this mean that intrinsics are a worse choice than 
hand-written assembly code?

> > Can you confirm with START_TIMER / STOP_TIMER that there is no 
> > gain?
>
> SystemTap probes provide identical functionality by measuring 
> deltas between function entry and function return.

Sorry, I don't understand:
Did you test with both methods to verify that they provide 
the same results?

Note that if it turns out that START_TIMER / STOP_TIMER 
cannot be used on ppc64 (le), this would be important 
information for us.

Carl Eugen


