[FFmpeg-devel] [PATCH] move H264 IDCT to yasm
Ronald S. Bultje
Mon Sep 6 23:16:21 CEST 2010
On Mon, Sep 6, 2010 at 5:00 PM, Ronald S. Bultje <rsbultje at gmail.com> wrote:
> this patch moves H264 IDCT (the LGPL part) to yasm. Performance for
> most loop-heavy functions improves quite a bit, some by up to 50%,
> because gcc does a terrible job of setting up loops (I'm not joking
> here). Performance for one particular function (intra16_mmx2) is
> mildly worse (a few cycles) and I don't quite understand why, since
> the code is identical. This might be related to alignment (gcc pads
> jump targets with nops to align them; I don't yet know how to do that
> in yasm); otherwise I don't really know. Let me know if you want
> detailed performance statistics for each function.
Please ignore the last part of the patch to h264dsp_mmx.c, which comments
out the GPL function pointers; that was for testing only and is fixed locally.