[Ffmpeg-devel] VP3/Theora Perfection

Thu May 19 15:19:10 CEST 2005

Michael Niedermayer wrote:

> Hi
> 
> On Thursday 19 May 2005 13:04, The Wanderer wrote:

<snip>
>> Here it looks to me like the "original" version spends about 5.5%
>> *less* time in unpack_dct_coeffs than the "new" one does (that's 1
>> - (4151489 / 43992551) ~= .056), i.e., the "new" one is slower.
>> Like the above, this is exactly the reverse of what you're saying;
>> is my brain just totally screwed up here, or is something else
>> going on?
> 
> hmm, i'll try to explain it again, no doubt my first try was a
> little strangely written
> 
> my dev tree got slower after a cvs up, i don't have the benchmark
> scores any more, and my dev tree changed since then so i can't rerun
> it easily
> 
> testing the r1.57 -> r1.58 change on a clean tree shows that the new
> version is faster (see my first benchmark) but if -finline-limit=2000
> is added then r1.57 is faster (see my second benchmark), it's also
> faster than r1.58 without -finline-limit=2000
> 
> the new code is also significantly smaller than the old as it
> replaces a "large" switch with look up tables
> 
> from all that evidence i conclude that gcc didn't inline
> unpack_token() or unpack_vlcs() originally and that the speed increase
> seen on a clean tree is not because the function is really faster but
> because it is inlined

I think I understand what happened now. Thanks for taking the time to
explain it again.
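
For anyone else following along, the kind of change being described (a
"large" switch replaced by lookup tables) can be sketched roughly as
below. This is only an illustration under my own assumptions: the token
values, the table contents, and the names run_length_switch(),
run_length_table(), run_length_tab and the two total_run_*() wrappers
are all made up for the example and are not taken from vp3.c.

```c
#include <stddef.h>

/* Hypothetical token set for illustration; not the real VP3 tokens. */
enum { TOKEN_COUNT = 4 };

/* Switch-based helper: one case per token. With many cases the body
 * grows, which can push it past the compiler's inlining size limit. */
static int run_length_switch(int token)
{
    switch (token) {
    case 0:  return 1;
    case 1:  return 2;
    case 2:  return 4;
    case 3:  return 8;
    default: return 0;
    }
}

/* Table-based helper: the same mapping stored as data, so the function
 * body is a bounds check plus one array load. */
static const unsigned char run_length_tab[TOKEN_COUNT] = { 1, 2, 4, 8 };

static inline int run_length_table(int token)
{
    return (token >= 0 && token < TOKEN_COUNT) ? run_length_tab[token] : 0;
}

/* Callers that loop over a token stream; whether the helper is inlined
 * into these loops is up to the compiler's heuristics. */
int total_run_switch(const unsigned char *tokens, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += run_length_switch(tokens[i]);
    return total;
}

int total_run_table(const unsigned char *tokens, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += run_length_table(tokens[i]);
    return total;
}
```

Both callers return the same totals; the difference is only in how big
the helper is. As I understand it, gcc decides whether to inline a
helper partly from its size, and -finline-limit=2000 raises that size
limit, so a benchmark of the calling function can change because the
helper was or was not inlined, not because the helper itself got faster.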

(I'd really like to understand the code and its underlying concepts well
enough to be able to help on this kind of work myself, or at least make
sense of what I read without bothering people with questions... but
nowadays it seems I can't even implement a simple dictionary finder
without intractable bugs cropping up on me.)

-- 
       The Wanderer

Warning: Simply because I argue an issue does not mean I agree with any
side of it.

A government exists to serve its citizens, not to control them.