[FFmpeg-devel] [PATCH/RFC] ALT_BITSTREAMREADER optimizations?for the platforms with strict alignment

Siarhei Siamashka siarhei.siamashka
Wed Dec 10 00:54:37 CET 2008


On Tuesday 09 December 2008, Michael Niedermayer wrote:
> On Tue, Dec 09, 2008 at 12:17:03AM +0200, Siarhei Siamashka wrote:
> > On Monday 08 December 2008, Michael Niedermayer wrote:
> > > On Mon, Dec 08, 2008 at 12:12:41PM +0200, Siarhei Siamashka wrote:
>
> [...]
>
> > > > Index: libavcodec/svq1dec.c
> > > > ===================================================================
> > > > --- libavcodec/svq1dec.c	(revision 16028)
> > > > +++ libavcodec/svq1dec.c	(working copy)
> > > > @@ -195,7 +195,7 @@
> > > >
> > > >  #define SVQ1_CALC_CODEBOOK_ENTRIES(cbook)\
> > > >        codebook = (const uint32_t *) cbook[level];\
> > > > -      bit_cache = get_bits (bitbuf, 4*stages);\
> > > > +      bit_cache = get_bits_long (bitbuf, 4*stages);\
> > > >        /* calculate codebook entries for this vector */\
> > > >        for (j=0; j < stages; j++) {\
> > > >          entries[j] = (((bit_cache >> (4*(stages - j - 1))) & 0xF) +
> > > > 16*j) << (level + 1);\
> > >
> > > this code (not only the patch) looks like it needs some work, and such
> > > work (cleanup) would likely make the 17bit problem go away as a side
> > > effect.
> >
> > Is anybody going to do this cleanup in the near future? I just wke onder
> > whether you are suggesting me to do this work or whatever.
>
> at the risk of a flamewar with you, yes i suggest you do clean these 4
> lines up instead of making the code alot slower.

ok, thanks, that's the first thing I wanted to clarify.

What do you think would be a proper cleanup in this case? I'm sorry, I still
can't always understand you correctly with just vague hints and this is what
may become a source of a flamewar.

This code can be changed either for better readability or for
better performance.

In the former case, just a simple loop doing calls to 'get_bits(bitbuf, 4)'
would be a solution.

In the latter case, we want to have as little UPDATE_CACHE operations as
possible. As the maximum value for 'stages' is 6, we don't need to read more
than 24 bits and a single UPDATE_CACHE operation is enough for the case when
MIN_CACHE_BITS is 25 or 32. Original code apparently aimed at minimizing
bitstream reader related overhead and tried to get '4*stages' bits at once
(and I can't say that I have something against this solution). Then the code
proceeds with processing this data in the icky looking loop (this is what
could be simplified for sure). The only real problem with this code is that
it does not take into account bitstream readers with MIN_CACHE_BITS set to 17.

Currently bitstream reader provides functions:
'get_bit1' ensures 1 bit of data
'get_bits_long' ensures up to 32 bits of data
'get_bits' is specified to guarantee only up to 17 bits, but in reality is 
implemented to provide 25 on all the platforms with unmodified ffmpeg
sources.

I suggest introducing a complete selection of functions: read 1 bit, read up
to 9 bits, read up to 17 bits, read up to 25 bits, read up to 32 bits. And
codec implementors could use appropriate ones according to their requirements.
Bitstream reader implementation can take advantage of having less bits to
provide in some cases. For 32-bit x86 platform with ALT_BITSTREAM_READER,
these special fast cases are 1 and 25 bits (implemented with reading int8_t or
int32_t). Little endian platforms with strict alignment would benefit from all
of these functions (the versions with less bits guaranteed result in faster
and more compact the code). 64-bit x86 systems could probably get a fast 32
bit variant as well.

As for this get_bits call in SVQ1_CALC_CODEBOOK_ENTRIES macro, the implementor
probably really wants to use the variant, which provides up to 25 bits.

> You dont have to of course but iam sure you will unerstand that i wont
> accept a workaround instead of a proper fix.
> And i doubt you would if you where making the decission and someone
> else did propose it.

Sorry, I have probably written too much. Anyway, to make a short summary:
please tell me what kind of modifications to this code you would like to
see and I'll do it.

-- 
Best regards,
Siarhei Siamashka




More information about the ffmpeg-devel mailing list