[FFmpeg-devel] [PATCH] faster vp6 decoding

Zuxy Meng zuxy.meng
Thu Feb 12 14:03:15 CET 2009


2009/2/12 Aurelien Jacobs <aurel at gnuage.org>:
> Zuxy Meng wrote:
>
>> Hi,
>>
>> 2009/2/9 Jason Garrett-Glaser <darkshikari at gmail.com>:
>> > +    "punpcklbw %%mm7, %%mm0\n\t"                                \
>> > +    "punpcklbw %%mm7, %%mm1\n\t"                                \
>> > +    "punpckhbw %%mm7, %%mm3\n\t"                                \
>> > +    "punpckhbw %%mm7, %%mm4\n\t"                                \
>> > +    "pmullw  0(%2), %%mm0\n\t" /* src[x-8 ] * biweight [0] */   \
>> > +    "pmullw  8(%2), %%mm1\n\t" /* src[x   ] * biweight [1] */   \
>> > +    "pmullw  0(%2), %%mm3\n\t" /* src[x-8 ] * biweight [0] */   \
>> > +    "pmullw  8(%2), %%mm4\n\t" /* src[x   ] * biweight [1] */   \
>> > +    "paddw %%mm1, %%mm0\n\t"                                    \
>> > +    "paddw %%mm4, %%mm3\n\t"                                    \
>> >
>> > This can be done faster with pmaddubsw (SSSE3-only, but worth making
>> > another version surely).
>>
>> Sure but that would require weights to be stored as arrays of int8_t
>> instead of int16_t?
>>
>> > Worthwhile if you make an SSE version.
>>
>> SSE2?
>>
>> > Works by interleaving the weights, allowing you to avoid the unpacks,
>> > use only two multiplies, and avoid the adds, too, I think.  If I'm
>> > right, that makes the entire thing quite a bit less than half the
>> > instructions.
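
For reference, a scalar C sketch of the pmaddubsw idea described above. It is only a model of the instruction's arithmetic, not code from the patch: the function names and the weight values are hypothetical. It also assumes the weights fit in int8_t, which is the open question raised earlier in this thread.

```c
#include <stdint.h>

/* Clamp to the signed 16-bit range, as pmaddubsw does on its sums. */
static int16_t sat16(int v)
{
    return v > 32767 ? 32767 : v < -32768 ? -32768 : (int16_t)v;
}

/* Scalar model of SSSE3 pmaddubsw on 16-byte operands:
 * u holds unsigned bytes, s holds signed bytes; each output word is
 * the saturated sum of two adjacent byte products. */
static void pmaddubsw_model(int16_t out[8], const uint8_t u[16],
                            const int8_t s[16])
{
    for (int i = 0; i < 8; i++)
        out[i] = sat16(u[2 * i] * s[2 * i] + u[2 * i + 1] * s[2 * i + 1]);
}

/* Hypothetical helper: interleave two pixel rows and two weights so that
 * one multiply-add yields a[i]*w0 + b[i]*w1 per output word, with no
 * unpack against zero and no separate add. */
static void weighted_sum8(int16_t out[8], const uint8_t a[8],
                          const uint8_t b[8], int8_t w0, int8_t w1)
{
    uint8_t px[16];
    int8_t  w[16];

    for (int i = 0; i < 8; i++) {
        px[2 * i]     = a[i];   /* like punpcklbw of the two rows */
        px[2 * i + 1] = b[i];
        w[2 * i]      = w0;     /* weights stored pre-interleaved */
        w[2 * i + 1]  = w1;
    }
    pmaddubsw_model(out, px, w);
}
```

Note the saturation: with large pixels and weights the pair sum can exceed 16 bits, which the unpack/pmullw/paddw version would wrap instead of clamp.
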
>>
>> I tried something like below and it's about 15% faster on my Pentium
>> M. The speed up should be more prominent on modern CPUs with 128 bit
>> FADD unit:
>>
>> [...some asm code...]
>
> Nice. I fixed it so that it works on x86_64 and I cleaned it up.
> It works but has some small visible artifacts.
> It would be great if you could fix attached patch so that it gives
> bitexact result with:
>  ffmpeg -i sample.flv -f framecrc out.crc

Can be fixed by expanding ff_pw_64 from uint64_t to xmm_reg.
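
A portable C sketch of why the constant's width matters. The assumption here is that the SSE2 code adds the rounding constant to eight 16-bit lanes at once, so it reads 16 bytes from memory. The arrays and the function below are hypothetical stand-ins, not the actual ff_pw_64 definition, and the zeros in the upper lanes of the 64-bit model stand for whatever happens to follow the constant in memory.

```c
#include <stdint.h>

#define SHIFT 7  /* rounding term 64 == 1 << (SHIFT - 1) */

/* Model of a 64-bit constant seen through a 128-bit load: only the
 * lower four lanes hold 64; the upper four are not part of the
 * constant (modeled as 0 here, an assumption). */
static const int16_t pw_64_as_uint64[8] = { 64, 64, 64, 64, 0, 0, 0, 0 };

/* Model of the constant widened to 128 bits: 64 in all eight lanes. */
static const int16_t pw_64_as_xmm[8] = { 64, 64, 64, 64, 64, 64, 64, 64 };

/* Scalar model of "paddw rnd, sum; psraw $7, sum" on eight lanes. */
static void round_shift8(int16_t out[8], const int16_t sum[8],
                         const int16_t rnd[8])
{
    for (int i = 0; i < 8; i++)
        out[i] = (int16_t)((sum[i] + rnd[i]) >> SHIFT);
}
```

With every lane of sum set to 100, the widened constant gives 1 in all eight lanes, while the 64-bit model gives 1 in the lower four and 0 in the upper four. A per-pixel rounding difference like that in half of each row would be consistent with small visible artifacts and a framecrc mismatch.
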

-- 
Zuxy
Beauty is truth,
While truth is beauty.
PGP KeyID: E8555ED6



