[FFmpeg-devel] [PATCH] h264 CAVLC coeff_token decoder based on CLZ
Sat Jan 23 20:03:40 CET 2010
On Sat, Jan 23, 2010 at 10:18 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
> On Sat, Jan 23, 2010 at 03:28:53AM +0300, Anatoliy Nenashev wrote:
> > Hi all!
> > I have done some investigation into the H264 CAVLC coeff_token decoder.
> > In the attached patch you can see a special implementation of a VLC decoder
> > for coeff_token which is based on CLZ (count leading zeros).
> > This method reduces the size of the VLC decoding tables for coeff_token from
> > (520+332+280+256)*2 = 2776 bytes to (2*4*16 + 64 + 67 + 63 + 63) = 385
FWIW: these tables are not accessed that often, and the code you are searching
for is often near the beginning of the table if you sort it by increasing code
My point, FWIW: try a simple linear search on the sorted table.
Extra goodies to try:
- put a sentinel at the end
- left-justify the code to search for on a 16-bit basis
so something like:
uint bits16 = show_ubits(re, gb, 16);
table = vlc_blah;
while (table->code16_left_justified > bits16) table++;
skal (just thinking out loud)
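The linear search suggested above can be sketched as follows. This is only a minimal illustration under stated assumptions: the `VlcEntry` layout, the toy prefix code and the names `vlc_linear_lookup` and `toy_decode` are hypothetical and are not taken from the patch or from FFmpeg. The loop condition implies that entries are ordered by decreasing left-justified code value, which for this toy code coincides with increasing code length. The final entry has code 0, so the scan always stops there; its length of 0 marks it as "no valid code".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry layout, not FFmpeg's actual VLC table format. */
typedef struct {
    uint16_t code16_left_justified; /* code shifted up to the top of 16 bits */
    uint8_t  len;                   /* code length in bits, 0 for the sentinel */
    uint8_t  value;                 /* decoded symbol */
} VlcEntry;

/* Toy prefix code: "1" -> 0, "01" -> 1, "001" -> 2; "000" is unused.
 * Entries are sorted by decreasing left-justified code value. */
static const VlcEntry toy_table[] = {
    { 0x8000, 1, 0 },
    { 0x4000, 2, 1 },
    { 0x2000, 3, 2 },
    { 0x0000, 0, 0 }, /* sentinel: code 0 always terminates the scan */
};

/* Return the first entry whose left-justified code is <= bits16. */
static const VlcEntry *vlc_linear_lookup(const VlcEntry *table, uint16_t bits16)
{
    assert(table != NULL);
    while (table->code16_left_justified > bits16)
        table++;
    return table;
}

/* Decode one symbol from the next 16 bits of the stream.
 * Stores the number of bits consumed in *len; returns -1 on the sentinel. */
static int toy_decode(uint16_t bits16, int *len)
{
    const VlcEntry *e = vlc_linear_lookup(toy_table, bits16);
    *len = e->len;
    return e->len ? e->value : -1;
}
```

A caller would peek 16 bits from the bitstream, call `toy_decode`, and then skip `*len` bits. Because short codes come first, the most frequent symbols are found after one or two comparisons.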
> > The whole reason is to reduce cache misses. Unfortunately, on my system
> > (Intel Core 2 Duo P8700, Debian Linux 64-bit) it makes no difference to
> > the original code's performance. Maybe the bottleneck is the implementation of
> > It would be interesting to see any tests on other systems. Any comments
> > welcome. If the community agrees with this kind of optimization, I will
> > continue this work and implement total_zeros and run_before decoding.
> It's great to see some other people work on h264 optimizations :)
> but it has to be faster to be interesting :(
> I don't think you will be able to beat the generic VLC code easily,
> but there are many other areas that one could try to speed up.
> There's lots of code and lots of it matters speed-wise ...
> Just put START/STOP_TIMER around something, change it, and see if it's
> better or not. START/STOP_TIMER will also tell you immediately how often it
> is executed, so you know if the code you picked is worth further tries
> or if it's too rarely executed.
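The measure-then-change workflow described above can be illustrated with a small standalone sketch. It is not FFmpeg's START_TIMER/STOP_TIMER implementation: `BenchStat`, `BENCH_START`, `BENCH_STOP`, `bench_report` and `work` are hypothetical names, and the sketch uses the portable `clock()` rather than a cycle counter. It only shows the two things the advice relies on: accumulated time for a region and the number of times that region ran.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for the idea behind START/STOP_TIMER. */
typedef struct {
    const char   *name;
    clock_t       total; /* accumulated CPU time in clock ticks */
    unsigned long runs;  /* how many times the region was executed */
} BenchStat;

#define BENCH_START(t0) clock_t t0 = clock()
#define BENCH_STOP(stat, t0)                 \
    do {                                     \
        (stat)->total += clock() - (t0);     \
        (stat)->runs++;                      \
    } while (0)

/* Print the run count and the average cost per run. */
static void bench_report(const BenchStat *s)
{
    assert(s != NULL);
    double avg = s->runs ? (double)s->total / (double)s->runs : 0.0;
    fprintf(stderr, "%s: %lu runs, %.2f ticks/run\n", s->name, s->runs, avg);
}

/* Example region under test: sum of the integers below n. */
static unsigned long work(unsigned n, BenchStat *stat)
{
    BENCH_START(t0);
    unsigned long sum = 0;
    for (unsigned i = 0; i < n; i++)
        sum += i;
    BENCH_STOP(stat, t0);
    return sum;
}
```

Wrapping a candidate region this way, changing the code, and comparing the reports shows both whether the change helped and whether the region runs often enough to matter.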
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
> Asymptotically faster algorithms should always be preferred if you have
> asymptotical amounts of data