[FFmpeg-devel] [PATCH] Optimization of original IFF codec

Sebastian Vater cdgs.basty
Sun Apr 25 13:49:54 CEST 2010

Hey to all!

I have a new patch (and my first git patch ;-)) ready that optimizes the
IFF codec. I also moved some ifs out of a critical loop (the checks for
8-bit versus 32-bit output, as well as for interleaved data).

This eliminates most of the inner-loop branches and thus reduces stalls
caused by branch mispredictions, which are quite expensive.
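
To illustrate the idea of moving the output-type check out of the loop, here is a minimal sketch. It is not taken from the patch: the function names, the parameters and the one-bit-per-pixel plane layout are assumptions made only for this example.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical example, not the code from iff-optimize.patch. */

/* Before: the output-width test is evaluated again for every pixel. */
static void or_plane_branchy(void *dst, const uint8_t *bits, size_t n,
                             unsigned plane, int is_32bit)
{
    for (size_t i = 0; i < n; i++) {
        /* Extract pixel i's bit, most significant bit first. */
        uint32_t bit = (bits[i >> 3] >> (7 - (i & 7))) & 1u;
        if (is_32bit)
            ((uint32_t *)dst)[i] |= bit << plane;
        else
            ((uint8_t *)dst)[i] |= (uint8_t)(bit << plane);
    }
}

/* After: the test runs once and each loop body has no type branch. */
static void or_plane_hoisted(void *dst, const uint8_t *bits, size_t n,
                             unsigned plane, int is_32bit)
{
    if (is_32bit) {
        uint32_t *d = dst;
        for (size_t i = 0; i < n; i++)
            d[i] |= ((bits[i >> 3] >> (7 - (i & 7))) & 1u) << plane;
    } else {
        uint8_t *d = dst;
        for (size_t i = 0; i < n; i++)
            d[i] |= (uint8_t)(((bits[i >> 3] >> (7 - (i & 7))) & 1u) << plane);
    }
}
```

Both variants produce the same output; the second one only trades a little code duplication for a loop body without the per-pixel type test.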

Michael Niedermayer wrote:
> among all these optimizations, i am wondering how much faster things become
> does that inline speed the code up?
> does the changing to unsigned?
> you can test easily by using the START/STOP_TIMER macros
I looked at that piece of code again and just found that the division is
not required at all.

I changed it to unsigned because that lets the compiler assume it can
replace * 8 with << 3.
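
A small sketch of the arithmetic behind this, with hypothetical helper names that do not come from the patch. As far as I know, signedness matters mostly for division: unsigned values can be divided by 8 with a plain shift, while signed division truncates toward zero and so needs an extra fix-up for negative inputs.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only. */

/* Unsigned: these match x << 3 and x >> 3 for every input
 * (the multiply wraps modulo 2^32). */
static uint32_t mul8_u(uint32_t x) { return x * 8; }
static uint32_t div8_u(uint32_t x) { return x / 8; }

/* Signed: C division truncates toward zero, so -1 / 8 is 0.
 * A bare arithmetic shift would round toward negative infinity
 * instead, which is why the compiler cannot use the shift alone. */
static int32_t div8_s(int32_t x) { return x / 8; }
```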
> (buf_size * 8 + bps - 1) / bps
> could be done outside the loop
> and the 2 loops look like they could be done as one loop
> that loop then can be unrolled by a factor of 4 and its inside for the
> uint8_t type case be implemented like:
>     v= lut[get_bits(&gb, 4)];
>     AV_WN32A(dst+b, AV_RN32A(dst+b) | v);
The thing is that type can be either uint8_t or uint32_t. The code is a
#define macro that takes the type (uint8_t or uint32_t) as a parameter.

So this is not fixed yet, because I'm unsure whether those two lines also
work when dst is a uint32_t pointer.
> (dst being a uint8_t pointer here, void pointers suck as one cannot add
>  to them)
Yes, I'd be glad if this can be fixed, too.
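
For reference, the lookup-table suggestion quoted above could look roughly like the following for the uint8_t case. This is only a sketch under assumptions: the function names and the table layout are made up for the example, plain nibble extraction replaces get_bits(&gb, 4), and memcpy stands in for AV_RN32A/AV_WN32A so that the block compiles on its own. With uint32_t output each pixel already fills a whole 32-bit word, so this particular packing would probably not carry over unchanged.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: build the 16-entry table for one bitplane.
 * Entry n holds four output bytes; byte k has bit `plane` set when
 * bit (3 - k) of nibble n is set, i.e. leftmost pixel first. */
static void build_plane_lut(uint32_t lut[16], unsigned plane)
{
    for (unsigned n = 0; n < 16; n++) {
        uint8_t px[4];
        for (unsigned k = 0; k < 4; k++)
            px[k] = (uint8_t)(((n >> (3 - k)) & 1u) << plane);
        memcpy(&lut[n], px, sizeof px); /* keeps the in-memory byte order */
    }
}

/* OR one bitplane row into chunky 8-bit pixels, four pixels per step.
 * dst must have room for 8 * src_bytes pixels. */
static void decode_plane8(uint8_t *dst, const uint8_t *src, size_t src_bytes,
                          const uint32_t lut[16])
{
    for (size_t i = 0; i < src_bytes; i++) {
        const unsigned nib[2] = { src[i] >> 4, src[i] & 0x0Fu };
        for (unsigned h = 0; h < 2; h++) {
            uint32_t cur;
            memcpy(&cur, dst, 4);   /* stand-in for AV_RN32A(dst) */
            cur |= lut[nib[h]];
            memcpy(dst, &cur, 4);   /* stand-in for AV_WN32A(dst, ...) */
            dst += 4;
        }
    }
}
```

Building the table through a byte array keeps the sketch independent of endianness, at the cost of one table per bitplane.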


Best regards,
                   :-) Basty/CDGS (-:

-------------- next part --------------
A non-text attachment was scrubbed...
Name: iff-optimize.patch
Type: text/x-patch
Size: 8594 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100425/9c9fb9d1/attachment.bin>
