[FFmpeg-devel] [PATCH] VBLE Decoder

Derek Buitenhuis derek.buitenhuis at gmail.com
Wed Nov 9 17:39:43 CET 2011


On 09/11/2011 9:14 AM, Michael Niedermayer wrote:
> please remove trailing whitespace, they arent allowed in ffmpeg nor 
> the fork. 

Woops. Fixed.

>
> the following should be quite a bit faster and prettier:
>
> static av_always_inline int vble_bitscan(GetBitContext *gb)
> {
>      static const uint8_t len[]={ ... };
>      int i= len[ show_bits(gb, 8) ];
>      skip_bits(gb, i+1);
>      return i;
> }

You're right. I discussed this with Kostya as well,
and am rewriting it.

> the sum of lengths should be checked against get_bits_left()

Yup. Thanks.

> reading 4 at a time with get_bits_long() is likely slower than reading 
> 2 at a time of just 1, later also would simplify the code 

I'll look into this.

>
> these should be outside the innermost loop, which should be easy
> to do when 1 value is read at a time but can be done with multiple
> values too, though then messier

I'll check into this. Thanks.

> if(lengths[p]) pix=get_bits(gb, lengths[p])
> (for the 1 sample at a time case)

Given context above.

> in case gcc compiles this to a conditional branch then the
> following should be faster:
> (pix>>  1) ^ -(pix&1);
>
> Note: you can check speed with START/STOP_TIMER

I'll check this, but this seems to be correct.

> this could use  add_hfyu_median_prediction()
> or is there some reason why its use is not possible ?

I had trouble figuring out exactly how to call
add_hfyu_median_prediction() properly. I'll double check it,
as it would likely be much, much faster (I believe it even has ASM).

> this has to be set before get_buffer() as it may affect the kind
> of buffer you get

Thanks. I wasn't aware of this. Fixed.

> you cannot change the linesize of buffers allocated by get_buffer()
> it also would break alignment for outside code like some video filter

Yeah I is mistaken about this. Fixed.

> you should check that the input contains at least 4 + (w*h*3)/16 bytes 

Extra checks can't be a bad thing (usually). Added.

>
> it should be faster to interleave the last get_bits() vble_decode() and
> vble_median_restore() linewise due to better L1 data cache use
> alternatively the 3 median_restore could be interleaved linewise with
> vble_decode() and draw_horiz_band() be implemented
> (CODEC_CAP_DRAW_HORIZ_BAND)
>
> also you could add CODEC_FLAG_GRAY support to allow faster but
> grayscale only decoding.
>
> These are all of course just ideas, its perfectly fine if you dont
> implement them or do them as seperate patches.

Interesting ideas nonetheless. I'll try these out. Though, not
terribly sure what a decent use-case for VBLE and CODEC_FLAG_GRAY
is. ;)

> the reading and use can be combined making this array unneeded
> the casts are also unneeded

Yeah, this was leftover from an earlier implementation I had written.

> If you want you can also add yourself to the MAINTAINERS file,
> that is if you want to maintain this code in/for ffmpeg. And it would
> be very welcome if you maintain it!

Sure.

> [...]

Thanks for the feedback!

- Derek


More information about the ffmpeg-devel mailing list