[FFmpeg-devel] [PATCH 3/9] x86: simple_idct_put: 10bits versions

Michael Niedermayer michael at niedermayer.cc
Fri Oct 9 19:42:02 CEST 2015


On Thu, Oct 08, 2015 at 08:22:50AM +0200, Christophe Gisquet wrote:
> Modeled from the prores version. Clips to [0;1023] and is bitexact.
> Bitexactness requires adding an offset in a different place compared
> to prores or C, which makes the function approximately 2% slower.
> 
> For 16 frames of a DNxHD 4:2:2 10bits test sequence:
> 
> C:    60861 decicycles in idct, 1048205 runs,    371 skips
> sse2: 27567 decicycles in idct, 1048216 runs,    360 skips
> avx:  26272 decicycles in idct, 1048171 runs,    405 skips
> ---
>  libavcodec/x86/Makefile                   |  1 +
>  libavcodec/x86/idctdsp_init.c             | 16 ++++++++++
>  libavcodec/x86/simple_idct.h              |  3 ++
>  libavcodec/x86/simple_idct10.asm          | 53 +++++++++++++++++++++++++++++++
>  libavcodec/x86/simple_idct10_template.asm | 12 +++++++
>  5 files changed, 85 insertions(+)
>  create mode 100644 libavcodec/x86/simple_idct10.asm
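
For reference, the "put" step described in the commit message above (store
the IDCT output clipped to [0;1023]) can be sketched in plain C roughly as
below. This is only an illustration under assumptions, not the code from the
patch: the names clip_10bit and put_clamped_10bit are hypothetical, the block
is assumed to be already inverse-transformed, and the bitexact offset
handling mentioned above is not modeled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: clamp a value to the 10-bit range [0, 1023]. */
static uint16_t clip_10bit(int v)
{
    if (v < 0)
        return 0;
    if (v > 1023)
        return 1023;
    return (uint16_t)v;
}

/* Hypothetical "put" step: store an already inverse-transformed 8x8 block
 * into a 16-bit-per-sample destination, clipping every sample to 10 bits.
 * line_size is the destination stride in bytes. */
static void put_clamped_10bit(uint8_t *dest, ptrdiff_t line_size,
                              const int16_t *block)
{
    for (int y = 0; y < 8; y++) {
        uint16_t *row = (uint16_t *)(dest + y * line_size);
        for (int x = 0; x < 8; x++)
            row[x] = clip_10bit(block[y * 8 + x]);
    }
}
```

The destination is assumed to be backed by 16-bit storage, as it would be
for a 10-bit picture plane.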

This breaks the following command (from how it looks, something to do with the scantables):
./ffplay -f lavfi testsrc -vf spp=1:40,format=yuv420p10

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Complexity theory is the science of finding the exact solution to an
approximation. Benchmarking OTOH is finding an approximation of the exact