[FFmpeg-devel] [PATCH] some SIMD write-combining for h264

Michael Niedermayer michaelni
Sat Jan 16 06:35:58 CET 2010


On Fri, Jan 15, 2010 at 11:11:23PM -0500, Alexander Strange wrote:
> This adds intreadwrite macros for 64/128-bit memory operations and uses them in h264.
> 
> Unlike the other macros, these assume correct alignment, and the patch only defines the ones there was an immediate use for.
> This only has x86 versions, but others should be easy. The 64-bit operations can be done with double copies on most systems, I guess.
> 
> Decoding a 30s file on Core 2 Merom with --cpu=core2 (minimum of 5 runs):
> x86-32: 12.72s before, 12.51s after (1.7%)
> x86-64: 10.24s before, 10.20s after (0.4%)
> 
> Tested on x86-32, x86-64, x86-32 with --arch=c.
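
For readers following along, a portable fallback for the aligned 64-bit copy and zero operations described above might look roughly like the sketch below. The names sketch_copy64 and sketch_zero64 are hypothetical and are not the macros from the patch. Going through memcpy and a temporary is only an assumption about what a plain C version could do; compilers typically reduce it to a single 64-bit load and store.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers, not the macros added by the patch. */

/* Copy 8 bytes; the caller promises both pointers are 8-byte aligned. */
static inline void sketch_copy64(void *dst, const void *src)
{
    uint64_t tmp;

    assert(((uintptr_t)dst & 7) == 0 && ((uintptr_t)src & 7) == 0);
    memcpy(&tmp, src, sizeof(tmp));  /* one 64-bit load on most targets */
    memcpy(dst, &tmp, sizeof(tmp));  /* one 64-bit store */
}

/* Zero 8 bytes; the caller promises dst is 8-byte aligned. */
static inline void sketch_zero64(void *dst)
{
    uint64_t zero = 0;

    assert(((uintptr_t)dst & 7) == 0);
    memcpy(dst, &zero, sizeof(zero));
}
```

The alignment asserts only document the contract; a real macro would presumably drop them and rely on the caller.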

As your code uses MMX, you need to at least mention the EMMS/float issue in the
docs, and it probably needs an emms_c() call before draw_horiz_band().
I don't know whether those are all the places affected.
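
To illustrate the EMMS/float issue: the MMX registers alias the x87 floating-point stack, so x87 code that runs after MMX code without an intervening EMMS can see a corrupted register state. Below is a minimal sketch, assuming a compiler that provides <mmintrin.h> and predefines __MMX__ when MMX is enabled (GCC and Clang do). sketch_copy_then_float is a hypothetical name, and _mm_empty() is the intrinsic that emits EMMS, standing in here for emms_c().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#if defined(__MMX__)
#include <mmintrin.h>
#endif

/* Hypothetical example: do an 8-byte copy through an MMX-typed value,
 * then run floating-point code afterwards. */
static double sketch_copy_then_float(uint8_t *dst, const uint8_t *src, double x)
{
    assert(dst && src);
#if defined(__MMX__)
    __m64 v;
    memcpy(&v, src, sizeof(v));   /* load 8 bytes into an MMX-typed value */
    memcpy(dst, &v, sizeof(v));   /* store them back out */
    _mm_empty();                  /* EMMS: release the x87 state before float code */
#else
    memcpy(dst, src, 8);          /* no MMX available: plain copy, no EMMS needed */
#endif
    return x * 0.5;               /* float math is safe from this point on */
}
```

The point is only the ordering: whatever touches MMX state has to be followed by EMMS before control reaches code that may use x87 floats, such as a user callback.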

Also, what sets __MMX__? We have our own defines for that.

And last:
a 1.7% gain narrows the gap to CoreAVC from 50% to 48.3%, if we assume we are 50% behind.

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Rewriting code that is poorly written but fully understood is good.
Rewriting code that one doesn't understand is a sign that one is less smart
than the original author; trying to rewrite it will not make it better.