[FFmpeg-devel] [PATCH] some SIMD write-combining for h264

Alexander Strange astrange
Sun Jan 17 03:30:04 CET 2010

On Jan 16, 2010, at 12:35 AM, Michael Niedermayer wrote:

> On Fri, Jan 15, 2010 at 11:11:23PM -0500, Alexander Strange wrote:
>> This adds intreadwrite macros for 64/128-bit memory operations and uses them in h264.
>> Unlike the other macros, these assume correct alignment, and the patch only defines the ones there was an immediate use for.
>> This only has x86 versions, but others should be easy. The 64-bit operations can be done with double copies on most systems, I guess.
>> Decoding a 30s file on Core 2 Merom with --cpu=core2 (minimum of 5 runs):
>> x86-32: 12.72s before, 12.51s after (1.7%)
>> x86-64: 10.24s before, 10.20s after (0.4%)
>> Tested on x86-32, x86-64, x86-32 with --arch=c.
> As your code uses MMX, you need to at least mention the EMMS/float issue in the
> docs, and probably add an emms_c(); call before draw_horiz_band().
> I don't know if that covers everything.

Added in the comment.

> Also, what sets __MMX__? We have our own defines for that.

__MMX__ is a gcc built-in define; it gets set when ./configure --cpu=x adds the matching -march flag.
HAVE_MMX describes the build and not the host CPU family, and this is inline asm, so it can't use HAVE_MMX.
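
To illustrate the idea, here is a rough sketch of a 64-bit aligned copy that is selected at compile time by __MMX__. It is not the code from the attached patch: the names copy64_aligned, leave_mmx_state, mv_block and copy64_demo are made up for this example, and the memcpy fallback is only an assumption about what a non-x86 version could look like.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* All names below are hypothetical; the real macros are in the attached patch. */

/* Four 16-bit values that can also be addressed as one aligned 64-bit word. */
typedef union {
    _Alignas(8) uint64_t u64;
    int16_t mv[4];
} mv_block;

/* Copy 8 bytes with a single load and store. Both pointers must be 8-byte
 * aligned; unlike byte-wise read/write helpers, this is assumed, not handled. */
static inline void copy64_aligned(uint64_t *dst, const uint64_t *src)
{
    assert(((uintptr_t)dst & 7) == 0 && ((uintptr_t)src & 7) == 0);
#if defined(__GNUC__) && defined(__MMX__)
    /* __MMX__ is predefined by the compiler when the target flags enable MMX. */
    __asm__ volatile("movq %1, %%mm0 \n\t"
                     "movq %%mm0, %0 \n\t"
                     : "=m"(*dst)
                     : "m"(*src)
                     : "mm0");
#else
    /* Portable fallback (an assumption, not taken from the patch). */
    memcpy(dst, src, sizeof(*dst));
#endif
}

/* Leave MMX state so that x87 floating-point code can run safely again. */
static inline void leave_mmx_state(void)
{
#if defined(__GNUC__) && defined(__MMX__)
    __asm__ volatile("emms" ::: "memory");
#endif
}

/* Copy four 16-bit values with one 64-bit move instead of four small stores.
 * Returns 1 if the destination matches the source afterwards. */
static int copy64_demo(void)
{
    mv_block src = { .mv = { 1, -2, 3, -4 } };
    mv_block dst = { 0 };

    copy64_aligned(&dst.u64, &src.u64);
    leave_mmx_state(); /* needed before any later float code on the MMX path */

    return memcmp(dst.mv, src.mv, sizeof(src.mv)) == 0;
}
```

Calling copy64_demo() exercises whichever path the compiler flags select. The explicit leave_mmx_state() call is the sketch's stand-in for the EMMS requirement mentioned above.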

> and last
> 1.7% makes 50->48.3 % left to CoreAVC if we assume we are 50% behind

Well, it turned out to work unusually well on that file (I had the idea while profiling it in the first place). The gain is closer to 1% on avchd-test-1.ts, which was posted here a while ago.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Add-macros-for-write-combining-optimization.patch
Type: application/octet-stream
Size: 4820 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100116/7d22b69d/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-H.264-Use-64-and-128-bit-write-combining-macros.patch
Type: application/octet-stream
Size: 9637 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100116/7d22b69d/attachment-0001.obj>
