[FFmpeg-devel] [patch][OpenHEVC]added ASM functions for epel + qpel

Pierre Edouard Lepere Pierre-Edouard.Lepere at insa-rennes.fr
Thu Mar 6 16:40:04 CET 2014


New patch, now all in a single, smaller file.

>> +    sub             srcq, 1
>Why? Just subtract one from src when you dereference it ([srcq-1]
>instead of [srcq]).

Because it's more convenient: the filters then always start at src, whether we are in h, v or hv.
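
To illustrate the convention, here is a minimal scalar sketch in C, not the code from the patch. The names epel_tap4, epel_h and epel_v are hypothetical, and the rewind by one stride in the vertical case is my assumption, by analogy with the "sub srcq, 1" quoted above. The point is only that, once the pointer has been rewound, the same 4-tap kernel reads forward from it in either direction.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 4-tap kernel: always reads forward from p, at
 * p[0], p[step], p[2*step], p[3*step]. It does not care whether
 * step is 1 (horizontal) or the row stride (vertical). */
static int epel_tap4(const uint8_t *p, ptrdiff_t step, const int8_t c[4])
{
    return c[0] * p[0] + c[1] * p[step] + c[2] * p[2 * step] + c[3] * p[3 * step];
}

/* Horizontal: rewind once by one pixel (the scalar analogue of
 * "sub srcq, 1"), then let the kernel start at src. */
static int epel_h(const uint8_t *src, const int8_t c[4])
{
    src -= 1;
    return epel_tap4(src, 1, c);
}

/* Vertical: rewind once by one row (assumed analogue for the v case),
 * then use the very same kernel with step = stride. */
static int epel_v(const uint8_t *src, ptrdiff_t stride, const int8_t c[4])
{
    src -= stride;
    return epel_tap4(src, stride, c);
}
```

With the half-sample chroma coefficients {-4, 36, 36, -4} and a flat 4x4 image holding 1..16, epel_h at the pixel of value 6 gives 64 * 6.5 and epel_v gives 64 * 8, which is what the un-normalized filter should return on a linear ramp.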

> +    EPEL_FILTER        8, mx
> +
> +    LOOP_INIT  epel_h_h_2_8

Labels with a dot are local and thus don't need full function name scope,
just loop instead of epel_h_h_2_8 is fine.

> +    EPEL_LOAD         8, src, 1
> +    EPEL_COMPUTE       8, 2
> +    PEL_STORE2       dst, m0, m1
> +    LOOP_END   epel_h_h_2_8, dst, dststride, src, srcstride
> +    RET

OK, so the actual code. For play, can you show the _actual disassembly_
that all these macros eventually got us to? I wonder what it actually gives.

>I can understand the pmaddwd approach for second pass may be faster for
>half-registers, since you fill the register up to full width and save one
>instruction - but did you measure it?
>Then, for second, you're just spending instructions shuffling. I don't
>think 2a is faster than 2b, in fact I expect it to be significantly slower.

This was done first with intrinsics, and pmulhw was needed, so it just adds too many instructions.
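
For readers following the instruction-count argument, here is a scalar C model of what the instructions compute per lane. It is a sketch of the instruction semantics only, not of the patch. The function names are hypothetical, and the assumption that the alternative to pmaddwd is a pmullw/pmulhw pair recombined by unpacks is my reading of the exchange.

```c
#include <stdint.h>

/* One 32-bit lane of pmaddwd: two signed 16x16 products, summed.
 * (The single wraparound case, all four inputs equal to -32768,
 * is not modelled here.) */
static int32_t pmaddwd_lane(int16_t a0, int16_t a1, int16_t b0, int16_t b1)
{
    return (int32_t)a0 * b0 + (int32_t)a1 * b1;
}

/* pmullw keeps the low 16 bits of the signed product. */
static uint16_t pmullw_lane(int16_t a, int16_t b)
{
    return (uint16_t)((uint32_t)((int32_t)a * b) & 0xFFFFu);
}

/* pmulhw keeps the high 16 bits of the signed product. */
static uint16_t pmulhw_lane(int16_t a, int16_t b)
{
    return (uint16_t)((uint32_t)((int32_t)a * b) >> 16);
}

/* Recombining both halves, as an unpack of the two results would,
 * rebuilds the full 32-bit product of ONE tap. */
static int32_t widen_product(int16_t a, int16_t b)
{
    uint32_t bits = ((uint32_t)pmulhw_lane(a, b) << 16) | pmullw_lane(a, b);
    return (int32_t)bits;
}
```

In this model one pmaddwd_lane call covers two taps, whereas the other route needs pmullw, pmulhw and a recombination per tap, plus an add to sum two taps. Whether that difference matters in practice is exactly what measurement would have to show.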
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-added-ASM-functions-for-HEVC.patch
Type: text/x-patch
Size: 76675 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20140306/6993b473/attachment.bin>