[FFmpeg-devel] Pipeline: H.264 speed improvements
Sun Dec 28 12:14:36 CET 2008
Michael Niedermayer wrote:
> On Tue, Dec 23, 2008 at 08:41:00PM +0000, Måns Rullgård wrote:
>> Michael Niedermayer <michaelni at gmx.at> writes:
>>> On Tue, Dec 23, 2008 at 04:08:26AM -0500, Jason Garrett-Glaser wrote:
>>>> I've put together a list of all the possible speed improvements I can
>>>> see, including both some obvious ones and non-obvious ones. If you're
>>>> interested in implementing anything here, say so to make sure your
>>>> work isn't duplicated by Michael or me. Also feel free to discuss some
>>>> of the more nutty ideas, like the VLC table, or tell me that I'm wrong
>>>> about something.
>>>> Non-assembly stuff:
>>>> av_log2 is unnecessarily powerful for use in h264.c. All signed
>>>> golomb values in H.264 fit in 16-bit, and all unsigned golomb values
>>>> other than headers fit in 8-bit. Thus all ordinary unsigned golomb
>>>> code reads can literally be put in a 256-byte VLC table and replaced
>>>> with a single array lookup.
>>> it may be that all ue golomb coded values are <256 outside the headers,
>>> though even this seems wrong for mb_skip_run the way i understand the spec.
>>> But a value of 255 corresponds to a 15bit long vlc code.
>>> a 256 (or 128) entry LUT limits one to values 0-15; 512 (or 1024) entries to 0-31.
>>> Now there are surely a few left that are that small, but that's far from
>>> all non header values.
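To put numbers on the table-size point above: an unsigned Exp-Golomb code
for value v is 2*floor(log2(v+1))+1 bits long, so a table indexed by n bits
can only fully resolve values up to 2^(floor((n-1)/2)+1) - 2. A minimal
sketch in C follows; the function names are made up for illustration and
are not taken from h264.c or golomb.h.

```c
#include <assert.h>

/* Bit length of the unsigned Exp-Golomb code for value v:
 * 2 * floor(log2(v + 1)) + 1. */
static int ue_code_length(unsigned v)
{
    unsigned x = v + 1;
    int log = 0;

    while (x >>= 1)
        log++;
    return 2 * log + 1;
}

/* Largest value whose whole code fits in a LUT indexed by `bits` bits.
 * Hypothetical helper, for illustration only. */
static unsigned ue_max_value_for_lut_bits(int bits)
{
    int log;

    assert(bits >= 1);
    log = (bits - 1) / 2;      /* longest zero prefix that still fits */
    return (2u << log) - 2;    /* v + 1 <= 2^(log + 1) - 1 */
}
```

By this count an 8-bit index covers values up to 14 and a 9-bit index up
to 30, which is the same ballpark as the 0-15 and 0-31 quoted above.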
>> av_log2() can be trivially implemented on most CPUs using a count
>> leading zeros instruction. That should be even faster than a table.
>> On ARM this instruction takes one cycle.
> patch & benchmark are welcome, but note, I don't think av_log2() is
> used much in h264.c
Please also benchmark ALAC decoding; it seems av_log2() is much more
speed critical there.
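
For anyone who wants to try the count-leading-zeros idea, here is a rough
sketch of what such a log2 could look like next to a plain shift loop. It
assumes GCC or Clang (__builtin_clz is their builtin, and is undefined for
0, hence the "| 1") and a 32-bit unsigned int. The names log2_ref, log2_clz
and check_log2 are hypothetical; this is not the current av_log2() code and
says nothing about actual speed, which is what the benchmark is for.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: floor(log2(v)) by repeated shifting; returns 0 for v == 0. */
static int log2_ref(uint32_t v)
{
    int n = 0;

    while (v >>= 1)
        n++;
    return n;
}

/* Same result via count-leading-zeros (GCC/Clang builtin, assumes
 * unsigned int is 32 bits). "v | 1" avoids the undefined clz(0) case
 * and makes the result 0 for both v == 0 and v == 1. */
static int log2_clz(uint32_t v)
{
    return 31 - __builtin_clz(v | 1);
}

/* Compare both versions for every v below `limit`. */
static void check_log2(uint32_t limit)
{
    for (uint32_t v = 0; v < limit; v++)
        assert(log2_clz(v) == log2_ref(v));
}
```

Calling check_log2(1 << 16) before timing anything is a cheap way to make
sure both versions agree on the inputs a decoder is likely to see.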