[FFmpeg-devel] VP8 decoder optimization status
Tue Jun 29 04:09:04 CEST 2010
Here's a rough guide to what's done and what needs to be done before
ffmpeg's VP8 decoder is as fast as a politician running away from an
6-tap motion compensation
bilinear motion compensation
luma dc WHT
i16x16 intra pred
i4x4 intra pred (V, DC, TM)
regular iDCT (patch by Ronald is on ML)
i4x4 intra pred (DDL, DDR, VR, HD, VL, HU)
ARM/PPC asm: nothing done yet
Fully convert vp5/6/7/8 arithmetic coder to bytestream: eliminate the
Port all of x264's and ffh264's optimizations once the above is done
(since they'll now be relevant).
Convert vp5/6/7/8 arithmetic coder to use a larger cache size (maybe
16-bit or 32-bit?) for fewer bytestream reads.
Optimize decode_block_coeffs (it can surely be made faster).
Improve edge emulation handling (we currently have the worst of both
worlds -- we require padding on the edges, yet we use the slow
ff_emulated_edge_mc -- we should pick one method or the other).
Optimize cache handling (mvs and nnz).
Optimize MV prediction.
Probably lots of other stuff I haven't thought of, feel free to
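
As an illustration of the "larger cache size" item above, here is a minimal sketch of a VP8-style boolean decoder that keeps a 64-bit window of the bytestream and refills it several bytes at a time, instead of fetching one byte per eight decoded bits. The struct and function names (BoolDec, booldec_init, booldec_refill, booldec_get) are hypothetical and are not ffmpeg's API. The window width is also an assumption: the list item suggests 16 or 32 bits, and 64 is used here only to keep the refill logic short.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoder state; not ffmpeg's actual range coder context. */
typedef struct {
    const uint8_t *buf;   /* next unread input byte */
    const uint8_t *end;   /* one past the last input byte */
    uint64_t value;       /* bit window, MSB-aligned; unused low bits are zero */
    int bits;             /* number of valid bits currently in the window */
    unsigned range;       /* coder range, 128..255 after normalization */
} BoolDec;

/* Top up the window with whole bytes; reads zeros past the end of input. */
static void booldec_refill(BoolDec *d)
{
    while (d->bits <= 56) {
        uint64_t byte = d->buf < d->end ? *d->buf++ : 0;
        d->value |= byte << (56 - d->bits);
        d->bits += 8;
    }
}

static void booldec_init(BoolDec *d, const uint8_t *buf, size_t size)
{
    d->buf   = buf;
    d->end   = buf + size;
    d->value = 0;
    d->bits  = 0;
    d->range = 255;
    booldec_refill(d);
}

/* Decode one bool whose probability of being 0 is prob/256. */
static int booldec_get(BoolDec *d, unsigned prob)
{
    unsigned split    = 1 + (((d->range - 1) * prob) >> 8);
    uint64_t bigsplit = (uint64_t)split << 56;
    int bit = 0;

    /* Only the top 8 bits take part in the comparison, so the refill
     * check fires roughly once per 7 bytes rather than once per byte. */
    if (d->bits < 8)
        booldec_refill(d);

    if (d->value >= bigsplit) {
        d->value -= bigsplit;
        d->range -= split;
        bit = 1;
    } else {
        d->range = split;
    }

    /* Renormalize so that range is back in 128..255. */
    while (d->range < 128) {
        d->range <<= 1;
        d->value <<= 1;
        d->bits--;
    }
    return bit;
}
```

A tuned version would presumably replace the byte loop in the refill with a single wide load and the normalization loop with a table lookup or a count-leading-zeros instruction. The point of the sketch is only that the refill branch is taken far less often with a wider window.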
The current top priority for x86 speed is far and away the Normal
loopfilter -- it's something like 60-70%+ of the total decode time, since
we've SIMD-optimized nearly everything else of note.
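
To give a sense of why the Normal loopfilter is costly in scalar code, here is a rough sketch of the per-position decision tests it runs before touching any pixels, written from a reading of the VP8 bitstream spec. The function names and the px[8] layout (p3 p2 p1 p0 | q0 q1 q2 q3 across the edge) are assumptions made for illustration, and the actual filter arithmetic that follows these tests is omitted. It should be checked against the spec or the decoder source before being relied on.

```c
#include <stdint.h>
#include <stdlib.h>

/* px holds the eight pixels straddling an edge, in the order
 * p3 p2 p1 p0 | q0 q1 q2 q3 (hypothetical layout for this sketch). */

/* Nonzero if this position should be filtered at all: the step across
 * the edge must be within edge_limit, and every neighbouring difference
 * on both sides must be within interior_limit. */
static int normal_filter_mask(const uint8_t px[8], int edge_limit,
                              int interior_limit)
{
    int p3 = px[0], p2 = px[1], p1 = px[2], p0 = px[3];
    int q0 = px[4], q1 = px[5], q2 = px[6], q3 = px[7];

    return 2 * abs(p0 - q0) + (abs(p1 - q1) >> 1) <= edge_limit &&
           abs(p3 - p2) <= interior_limit &&
           abs(p2 - p1) <= interior_limit &&
           abs(p1 - p0) <= interior_limit &&
           abs(q3 - q2) <= interior_limit &&
           abs(q2 - q1) <= interior_limit &&
           abs(q1 - q0) <= interior_limit;
}

/* Nonzero if there is "high edge variance" next to the edge, which
 * selects a narrower filter so that real detail is not smeared. */
static int high_edge_variance(const uint8_t px[8], int thresh)
{
    int p1 = px[2], p0 = px[3], q0 = px[4], q1 = px[5];

    return abs(p1 - p0) > thresh || abs(q1 - q0) > thresh;
}
```

Each position along each edge loads eight pixels and makes up to nine comparisons before any filtering happens, which is the kind of branchy per-pixel work that SIMD masks handle well.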