# [FFmpeg-devel] [PATCH][RFC] Lagarith Decoder.

Nathan Caldwell saintdev
Fri Aug 14 21:53:36 CEST 2009

```On Wed, Aug 12, 2009 at 7:54 AM, Reimar
Döffinger<Reimar.Doeffinger at gmx.de> wrote:
> On Wed, Aug 12, 2009 at 02:12:55PM +0200, Michael Niedermayer wrote:
>> On Mon, Aug 10, 2009 at 11:42:19PM -0600, Nathan Caldwell wrote:
>> > On Sat, Aug 8, 2009 at 6:32 AM, Michael Niedermayer<michaelni at gmx.at> wrote:
>> > >> +/* Fast round up to least power of 2 >= to x */
>> > >> +static inline uint32_t clp2(uint32_t x)
>> > >> +{
>> > >> +    x--;
>> > >> +    x |= (x >> 1);
>> > >> +    x |= (x >> 2);
>> > >> +    x |= (x >> 4);
>> > >> +    x |= (x >> 8);
>> > >> +    x |= (x >> 16);
>> > >> +    return x+1;
>> > >> +}
>> > >
>> > > is 1<<av_log2(x) faster?
>> >
>> > Might be, but it gives different results, so it's a moot point.
>>
>> 2<<av_log2(x-1)
>> or whatever
>
> Well, that all depends on what input range is needed.
> E.g. for 0 the documentation does not match the behaviour
> for the original function (returns 0 which is not even a
> power of 2).
> In the worst case, you'd have to do
> return x > 1 ? 2 << av_log2(x - 1) : x;
> I think, which has a small but still existing chance of
> being faster.

Well, that went OT rather quickly, lol.
0 input doesn't really matter. If we have a cumulative probability of
0, then that means all probabilities are 0 and we have larger problems
than nearest power of 2 being incorrect.
Anyway, for my tests clp2 was faster than av_log2 by quite a large
margin (~2000 dezicycles for av_log2 vs. ~400 dezicycles for clp2;
tested on both Core2 and lolAtom and got the same results). However
this is only run once per plane, and av_log2 looks cleaner, so I'll