[FFmpeg-devel] [PATCH] VP8 MMX optimizations (MC and IDCT dc_add)
Ronald S. Bultje
Fri Jun 25 18:09:06 CEST 2010
On Wed, Jun 23, 2010 at 3:38 AM, Jason Garrett-Glaser
<darkshikari at gmail.com> wrote:
> There are some... issues... with the current code that prevent commit.
> I will be bringing these up with Ronald soon ;)
The main issue was the VLAs (variable-length arrays): I didn't
distinguish srcstride/deststride where relevant, which led to huge
stack allocations. The attached patch fixes that in a manner similar to
h264dsp_mmx.c. This is still only MMX/EXT; if this is OK, I'll apply it
and work on adapting and integrating Jason's SSE2/SSSE3 work, too (or
he can do that himself).
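
For illustration only, here is a minimal sketch of the kind of problem
described above, under the assumption that the temporary buffer of a
two-pass filter was being sized by the picture stride. It is not the code
from the patch: the function names, the bilinear filter (a stand-in for the
real VP8 filters) and the 16x16 size limit are all hypothetical. The point
is that the intermediate buffer gets its own small, fixed stride, separate
from srcstride and dststride.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limits: largest block is 16x16, and the vertical pass of
 * this stand-in bilinear filter needs one extra row of input. */
enum { MAX_W = 16, MAX_H = 16, EXTRA_ROWS = 1 };

/* Stack bytes a VLA such as "uint8_t tmp[stride * (h + EXTRA_ROWS)]"
 * would need if it were sized by the picture stride. */
static size_t vla_bytes(ptrdiff_t picture_stride, int h)
{
    return (size_t)picture_stride * (size_t)(h + EXTRA_ROWS);
}

/* Stack bytes of a fixed buffer that uses its own block-sized stride. */
static size_t fixed_bytes(void)
{
    return (size_t)MAX_W * (MAX_H + EXTRA_ROWS);
}

/* Half-pel bilinear interpolation in both directions, done in two passes.
 * src must be readable for (w + 1) columns and (h + 1) rows;
 * w <= MAX_W and h <= MAX_H are assumed. */
static void put_bilinear_hv(uint8_t *dst, ptrdiff_t dststride,
                            const uint8_t *src, ptrdiff_t srcstride,
                            int w, int h)
{
    uint8_t tmp[MAX_W * (MAX_H + EXTRA_ROWS)];
    const ptrdiff_t tmpstride = MAX_W; /* independent of src/dst strides */
    int x, y;

    /* Pass 1 (horizontal): read with srcstride, write with tmpstride. */
    for (y = 0; y < h + EXTRA_ROWS; y++)
        for (x = 0; x < w; x++)
            tmp[y * tmpstride + x] =
                (uint8_t)((src[y * srcstride + x] +
                           src[y * srcstride + x + 1] + 1) >> 1);

    /* Pass 2 (vertical): read with tmpstride, write with dststride. */
    for (y = 0; y < h; y++)
        for (x = 0; x < w; x++)
            dst[y * dststride + x] =
                (uint8_t)((tmp[y * tmpstride + x] +
                           tmp[(y + 1) * tmpstride + x] + 1) >> 1);
}
```

With these assumed sizes the fixed buffer is 272 bytes regardless of the
picture width, whereas a buffer sized by a stride of, say, 1984 bytes
would take 33728 bytes of stack for a single 16-row block.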
The other issue Jason brought up is that splitmv (a 4x4 grid of
subblocks, 4x4 pixels each, in a 16x16 macroblock) is always handled as
16 separate 4x4 blocks, whereas usually several of them share the same
MV (forming 4x8, 8x4, 8x8, etc. partitions). I intend to address this,
but in a separate patch, because the issue is unrelated to the VLA one,
touches different code and doesn't affect the optimizations themselves
(it'll just make heavier use of the SSE2/SSSE3 functions once done
correctly).
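
As a rough sketch of the merging idea (not the planned patch), the code
below checks whether the four 4x4 subblocks of each 8x8 quadrant share one
MV and, if so, counts a single 8x8 prediction instead of four 4x4 ones.
The MV type, the raster-order layout of the 16 vectors and the function
names are assumptions made for this example; 4x8, 8x4 and larger merges
are left out to keep it short.

```c
#include <stdint.h>

/* Hypothetical motion vector type. */
typedef struct { int16_t x, y; } MV;

static int mv_eq(MV a, MV b)
{
    return a.x == b.x && a.y == b.y;
}

/* mv[] holds the 16 subblock MVs of one macroblock in raster order
 * (4 per row). Returns how many motion-compensation calls are needed
 * when equal-MV 8x8 quadrants are predicted in one call. */
static int count_mc_calls(const MV mv[16])
{
    int calls = 0;
    int qx, qy;

    for (qy = 0; qy < 4; qy += 2) {
        for (qx = 0; qx < 4; qx += 2) {
            /* m[0], m[1] are the top pair, m[4], m[5] the bottom pair. */
            const MV *m = &mv[qy * 4 + qx];

            if (mv_eq(m[0], m[1]) && mv_eq(m[0], m[4]) && mv_eq(m[0], m[5]))
                calls += 1; /* one 8x8 prediction */
            else
                calls += 4; /* four separate 4x4 predictions */
        }
    }
    return calls;
}
```

A macroblock whose 16 MVs are all equal needs 4 calls here instead of 16;
the wider blocks are also the ones where SSE2/SSSE3 routines can be used.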
Please comment; I'd love to apply these patches to get my tree a
little more aligned with upstream.