[FFmpeg-devel] [RFC] optimize ff_emulated_edge_mc

Michael Niedermayer michaelni
Thu Dec 30 11:26:24 CET 2010


On Wed, Dec 29, 2010 at 10:03:04PM -0500, Ronald S. Bultje wrote:
> Hi,
> 
> On Wed, Dec 29, 2010 at 8:06 PM, Ronald S. Bultje <rsbultje at gmail.com> wrote:
> > emu_edge_mc looks optimizable and shows up in my profilings. A simple
> > loop->memcpy makes things a lot faster already (see attached):
> [..]
> > after
> [..]
> > 6165 dezicycles in ff_emulated_edge_mc, 1048040 runs, 536 skips
> > 6115 dezicycles in ff_emulated_edge_mc, 1048044 runs, 532 skips
> > 6087 dezicycles in ff_emulated_edge_mc, 1048158 runs, 418 skips
> >
> > before
> [..]
> > 9104 dezicycles in ff_emulated_edge_mc, 1047805 runs, 771 skips
> > 9131 dezicycles in ff_emulated_edge_mc, 1047866 runs, 710 skips
> > 9097 dezicycles in ff_emulated_edge_mc, 1047874 runs, 702 skips
> [..]
> 
> A few more changes attached; doing memcpy() on the top/bottom edge
> brings it to 540 cycles:
> 
> 5414 dezicycles in ff_emulated_edge_mc, 1048331 runs, 245 skips
> 
> and then reordering the left/right edge loop a little brings it to 520:
> 
> 5186 dezicycles in ff_emulated_edge_mc, 1048288 runs, 288 skips
> 
> I'm too lazy to run this multiple times.
> 
> For the left/right edge fills, I tried using memset(), but that slows
> it down considerably; it appears the compiler doesn't inline it. Jason
> said he saw the same on some compilers with the memcpy() trick. Which
> makes me think, maybe we can emulate the inline memset() trick with
> some more elaborate C code? What I'm thinking is basically edge_val *=
> 0x01010101U; while (to_write >= 4) write(edge_val); if (to_write&2)
> write(edge_val); if (to_write & 1) write(edge_val); or so. Also, since
> most time is spent in copying the blocks quite literally, the main
> copy block could certainly use some optimizations, especially since
> width is generally something like 16...
> 
> Ronald

>  dsputil.c |   22 ++++++++++------------
>  1 file changed, 10 insertions(+), 12 deletions(-)
> 6b5be1a69247178dd53af1f622a49750d231045d  emu_edge_mc.patch

feel free to commit whatever makes ff_emulated_edge_mc() faster
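
For reference, the memset() emulation described in the quoted mail could
look roughly like the sketch below. This is only an illustration, not code
from the attached patch: the name fill_edge() and its signature are
hypothetical, and the fixed-size memcpy() calls are an assumption made here
to keep the word stores alignment-safe (compilers commonly turn them into
plain stores, but that is not guaranteed).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the patch): fill n bytes at dst with the
 * byte value v, using 32-bit writes for the bulk and a 2-byte plus 1-byte
 * tail, as outlined in the quoted mail. */
static void fill_edge(uint8_t *dst, uint8_t v, int n)
{
    /* Replicate the byte into all four byte lanes of a 32-bit word. */
    uint32_t v4 = v * 0x01010101U;

    while (n >= 4) {
        memcpy(dst, &v4, 4);   /* fixed-size copy, safe for unaligned dst */
        dst += 4;
        n   -= 4;
    }
    if (n & 2) {
        memcpy(dst, &v4, 2);   /* all bytes of v4 are equal, so endianness is irrelevant */
        dst += 2;
    }
    if (n & 1)
        *dst = v;
}
```

Whether this actually beats a library memset() call for the short runs seen
in ff_emulated_edge_mc() would have to be measured with the same timer
setup as above.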

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Many that live deserve death. And some that die deserve life. Can you give
it to them? Then do not be too eager to deal out death in judgement. For
even the very wise cannot see all ends. -- Gandalf