[FFmpeg-devel] [PATCH] move h264 loopfilter strength code to yasm

Daniel Verkamp daniel
Fri Sep 24 18:26:49 CEST 2010


On Fri, Sep 24, 2010 at 9:04 AM, Ronald S. Bultje <rsbultje at gmail.com> wrote:
> Hi,
>
> On Thu, Sep 23, 2010 at 6:13 PM, Ronald S. Bultje <rsbultje at gmail.com> wrote:
>> $subj. This could likely be done in inline asm as well but I still
> [..]
>
> Attached patch #1:
> 1) unrolls loop (allows inlining of a lot more, saves registers/stack
> after: 976 dezicycles in lf-strength, 4194102 runs, 202 skips
> before: 1164 dezicycles in lf-strength, 4194083 runs, 221 skips
>
> (the yasm version was ~86 cycles, which I hope to eventually reach by
> eliminating the pand and the mask_dir variable, merging edge and
> b_idx, etc.)
>
> So removing pand (which doesn't do anything in the one case, and can
> be replaced by a pxor in the other). With the attached patch #2, I get
> this:
> /var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//cc8uAjPS.s:315:bad
> register name `%%mm0'
> /var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//cc8uAjPS.s:520:bad
> register name `%%mm0'
>
> What does that mean?

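If you omit all of the optional colon-separated arguments to asm (the
output, input and clobber lists), gcc treats the string as basic asm
and passes it to the assembler verbatim. In that form the % before a
register name must not be doubled (I suppose since there can be no
substitution when there are no operand constraints), which is why the
assembler ends up seeing a literal `%%mm0'.  You can either add an
empty : after the asm string or drop the doubled % to avoid this.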
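
To make the two forms concrete, here is a minimal sketch, assuming gcc
or clang on x86 (the function names are made up for illustration and
are not from the patch; the non-x86 branch is only a stub so the file
still builds elsewhere):

```c
#include <stdint.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))

/* Basic asm: no colon after the string, so the text goes to the
 * assembler unchanged and registers take a single %. */
static inline void clear_mm0_basic(void)
{
    __asm__ volatile ("pxor %mm0, %mm0\n\t"
                      "emms");
}

/* Extended asm with an empty operand section: the lone ':' is enough
 * to turn on operand substitution, so registers need %%. */
static inline void clear_mm0_empty_colon(void)
{
    __asm__ volatile ("pxor %%mm0, %%mm0\n\t"
                      "emms"
                      :);
}

/* Extended asm with a real operand: %0 is substituted by the
 * compiler, literal registers are written as %%mm0. */
static inline uint32_t read_cleared_mm0(void)
{
    uint32_t out;
    __asm__ volatile ("pxor %%mm0, %%mm0\n\t"
                      "movd %%mm0, %0\n\t"
                      "emms"
                      : "=r"(out)
                      :
                      : "mm0");
    return out;
}

#else

/* Hypothetical stand-ins for non-x86 targets; nothing to demonstrate. */
static inline void clear_mm0_basic(void) {}
static inline void clear_mm0_empty_colon(void) {}
static inline uint32_t read_cleared_mm0(void) { return 0; }

#endif
```

Writing %%mm0 in the first function (no colon) is what produces the
"bad register name `%%mm0'" error from the assembler.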

-- Daniel


