[FFmpeg-devel] [PATCH 1/5] x86: hevc_mc: split differently calls

Michael Niedermayer michaelni at gmx.at
Sun Aug 24 12:11:24 CEST 2014


On Sun, Aug 24, 2014 at 08:46:30AM +0000, Christophe Gisquet wrote:
> For some unusual widths, 2 or 3 calls are made to functions handling a
> narrower width. Instead, make 2 calls to functions of different widths
> to split the workload.
> 
> The 8+16 and 4+8 splits, for 8 bits and more than 8 bits respectively,
> can't be processed that way without modifications: some calls use
> unaligned buffers, and adding branches to handle this yielded no
> micro-benchmark benefit.
> 
> For block_w == 12 (around 1% of the pixels of the sequence):
> Before:
> 12758 decicycles in epel_uni, 4093 runs, 3 skips
> 19389 decicycles in qpel_uni, 8187 runs, 5 skips
> 22699 decicycles in epel_bi, 32743 runs, 25 skips
> 34736 decicycles in qpel_bi, 32733 runs, 35 skips
> 
> After:
> 11929 decicycles in epel_uni, 4096 runs, 0 skips
> 18131 decicycles in qpel_uni, 8184 runs, 8 skips
> 20065 decicycles in epel_bi, 32750 runs, 18 skips
> 31458 decicycles in qpel_bi, 32753 runs, 15 skips
> ---
>  libavcodec/x86/hevcdsp_init.c | 43 +++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 41 insertions(+), 2 deletions(-)

applied
thanks
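
For readers skimming the thread, the idea in the commit message can be
sketched roughly as follows. This is only an illustration, not the code
in libavcodec/x86/hevcdsp_init.c: the names put_w4, put_w8, put_w12_rep,
put_w12_mix and kernel_calls are hypothetical, the kernels are plain C
copies standing in for the SIMD functions, and the 8-then-4 ordering for
block_w == 12 is an assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Counts kernel invocations; exists only to make the difference visible. */
static int kernel_calls;

/* Hypothetical stand-in for a SIMD kernel that handles exactly 4 columns. */
static void put_w4(uint8_t *dst, ptrdiff_t dst_stride,
                   const uint8_t *src, ptrdiff_t src_stride, int height)
{
    kernel_calls++;
    for (int y = 0; y < height; y++)
        memcpy(dst + y * dst_stride, src + y * src_stride, 4);
}

/* Hypothetical stand-in for a SIMD kernel that handles exactly 8 columns. */
static void put_w8(uint8_t *dst, ptrdiff_t dst_stride,
                   const uint8_t *src, ptrdiff_t src_stride, int height)
{
    kernel_calls++;
    for (int y = 0; y < height; y++)
        memcpy(dst + y * dst_stride, src + y * src_stride, 8);
}

/* Repeated scheme: width 12 handled as three calls to the width-4 kernel. */
static void put_w12_rep(uint8_t *dst, ptrdiff_t dst_stride,
                        const uint8_t *src, ptrdiff_t src_stride, int height)
{
    for (int x = 0; x < 12; x += 4)
        put_w4(dst + x, dst_stride, src + x, src_stride, height);
}

/* Mixed scheme: width 12 handled as one width-8 call plus one width-4 call.
 * The second call starts 8 pixels into the row. */
static void put_w12_mix(uint8_t *dst, ptrdiff_t dst_stride,
                        const uint8_t *src, ptrdiff_t src_stride, int height)
{
    put_w8(dst,     dst_stride, src,     src_stride, height);
    put_w4(dst + 8, dst_stride, src + 8, src_stride, height);
}
```

Both wrappers write the same 12 columns; the mixed one simply needs one
call fewer. Note that the second call in the mixed wrapper starts at an
offset of 8 pixels, which is where the alignment concern raised in the
commit message for the 8+16 and 4+8 cases would come into play.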

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Complexity theory is the science of finding the exact solution to an
approximation. Benchmarking OTOH is finding an approximation of the exact