[FFmpeg-devel] [PATCH] ac3enc: Add x86-optimized function to speed up log2_tab().

Justin Ruggles justin.ruggles
Sun Feb 13 20:49:50 CET 2011

AC3DSPContext.ac3_max_msb_abs_int16() finds the maximum MSB of the absolute
value of each element in an array of int16_t.
Updated patch based on comments from Mans, Loren, and Ronald.

Added range constraint to function documentation.
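
For readers without the attachment, here is a rough scalar sketch of what the function computes. It is not copied from the patch: the names max_msb_abs_int16_ref and msb_index are made up for illustration. The idea is that OR-ing the absolute values does not give the maximum itself, but it does give a value whose highest set bit matches that of the maximum, and the highest set bit is all a log2 calculation needs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical reference version: OR together abs() of every element.
 * The result has the same most significant bit as max(abs(src[i])). */
static int max_msb_abs_int16_ref(const int16_t *src, int len)
{
    int v = 0;
    assert(len > 0);
    for (int i = 0; i < len; i++)
        v |= abs(src[i]);   /* src[i] is promoted to int before abs() */
    return v;
}

/* Hypothetical helper: index of the highest set bit (0 when v is 0 or 1). */
static int msb_index(unsigned v)
{
    int n = 0;
    while (v >>= 1)
        n++;
    return n;
}
```

A caller would then take msb_index(max_msb_abs_int16_ref(src, len)). One likely reason for a range constraint, stated here as an assumption, is that -32768 has no positive counterpart in int16_t, so a SIMD version working on 16-bit lanes cannot represent its absolute value as a signed word.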

Loren's suggestion of using min/max when available is faster.

Using the min/max approach for the C version is about 15% faster on
Athlon64 but 30% slower on Atom.  The existing version is simpler so I
just left it as-is.
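
For comparison, a min/max variant in C could look roughly like the sketch below. This is an illustration under my own assumptions, not the code that was benchmarked, and the name max_abs_int16_minmax is hypothetical. It tracks the running minimum and maximum and takes the larger magnitude at the end, which has the same MSB as the OR of all absolute values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical min/max variant: returns max(abs(src[i])) by tracking the
 * smallest and largest element instead of OR-ing absolute values. */
static int max_abs_int16_minmax(const int16_t *src, int len)
{
    int lo = 0, hi = 0;     /* starting at 0 is safe: abs() is never below 0 */
    assert(len > 0);
    for (int i = 0; i < len; i++) {
        if (src[i] < lo) lo = src[i];
        if (src[i] > hi) hi = src[i];
    }
    return -lo > hi ? -lo : hi;
}
```

The two comparisons per element are what compilers may or may not turn into branch-free code, which could explain why the relative speed differs so much between CPUs.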

Ronald's suggestion of using shuffles+por instead of doing the final
calculations from the stack is faster in some situations and about the
same in others.  But it's simpler overall and avoids messing around
with the stack so I used it.
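
The shuffle+por reduction can be sketched with SSE2 intrinsics as below. This is only an approximation of the idea written in C: the actual asm in the patch may use different shuffle instructions or immediates, and or_reduce_8x16 is a made-up name. Each step ORs the register with a shuffled copy of itself, halving the number of distinct lanes until word 0 holds the OR of all eight words. A scalar fallback is included so the snippet also builds on non-x86 targets.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: OR of eight 16-bit words, returned as 0..65535. */
static int or_reduce_8x16(const int16_t *v)
{
    assert(v);
#if defined(__SSE2__)
    __m128i x = _mm_loadu_si128((const __m128i *)v);
    /* swap the 64-bit halves and OR: low half now covers all 8 words */
    x = _mm_or_si128(x, _mm_shuffle_epi32(x, _MM_SHUFFLE(1, 0, 3, 2)));
    /* bring words 2,3 down to positions 0,1 and OR */
    x = _mm_or_si128(x, _mm_shufflelo_epi16(x, _MM_SHUFFLE(1, 0, 3, 2)));
    /* bring word 1 down to position 0 and OR */
    x = _mm_or_si128(x, _mm_shufflelo_epi16(x, _MM_SHUFFLE(0, 0, 0, 1)));
    return _mm_cvtsi128_si32(x) & 0xFFFF;
#else
    int r = 0;
    for (int i = 0; i < 8; i++)
        r |= (uint16_t)v[i];
    return r;
#endif
}
```

Everything stays in registers, so nothing has to be spilled to the stack and re-read with scalar loads.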

Athlon64 X2 6000+:
   C: 20718
 MMX:  3590
MMX2:  2906
SSE2:  2062

Atom 330:
    C: 31838
  MMX:  7394
 SSE2:  3138
SSSE3:  2759

 libavcodec/ac3dsp.c         |    9 +++++
 libavcodec/ac3dsp.h         |   11 +++++++
 libavcodec/ac3enc_fixed.c   |   11 ++-----
 libavcodec/x86/ac3dsp.asm   |   69 +++++++++++++++++++++++++++++++++++++++++++
 libavcodec/x86/ac3dsp_mmx.c |   11 +++++++
 5 files changed, 103 insertions(+), 8 deletions(-)

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-ac3enc-Add-x86-optimized-function-to-speed-up-log2_t.patch
Type: text/x-patch
Size: 6326 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20110213/bda0851e/attachment.bin>
