[FFmpeg-devel] [PATCH] move h264 loopfilter strength code to yasm
Loren Merritt
lorenm
Fri Sep 24 22:50:38 CEST 2010
On Fri, 24 Sep 2010, Ronald S. Bultje wrote:
> Hi,
>
> On Fri, Sep 24, 2010 at 3:30 PM, Alexander Strange
> <astrange at ithinksw.com> wrote:
>> "g" permits registers, so it could generate something like this:
>> ? por %rax(%rcx), %mm1
>>
>> You have to use "m" or MANGLE() in this situation.
>> Also, generating _d_idx(%rax) won't work on BROKEN_RELOCATIONS (x86-64 darwin) because all global references must be _d_idx(%rip).
>>
>> If you have a recent clang try compiling with that, it has a built-in assembler which may have clearer errors.
>> ?and which doesn't build ffmpeg because it doesn't know about 3dnow.
>
> No, no, I don't want registers, I want it to load as a constant, i.e.
> a static number. The register (later on) would be b_idx, so that I
> don't need separate regs for each variable, plus var+b_idx, plus
> var+b_idx+d_idx.
>
> @@ -152,7 +152,7 @@
> "movd (%0), %%mm1 \n"
> "por (%0,%1), %%mm1 \n" // nnz[b] || nnz[bn]
> ::"r"(nnz+b_idx),
> - "r"(d_idx)
> + "g"(d_idx)
> );
> __asm__ volatile(
> "pminub %%mm7, %%mm1 \n"
>
> Also has fun effects:
>
> /var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//ccREEh7B.s:341:Bad
> Absolute Expression, absolute 0 assumed.
> /var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//ccREEh7B.s:341:expecting
> scale factor of 1, 2, 4, or 8: got `$-8)'
> /var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//ccREEh7B.s:430:Bad
> Absolute Expression, absolute 0 assumed.
> /var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//ccREEh7B.s:430:expecting
> scale factor of 1, 2, 4, or 8: got `$-1)'
> /var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//ccREEh7B.s:466:Bad
> Absolute Expression, absolute 0 assumed.
> /var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//ccREEh7B.s:466:expecting
> scale factor of 1, 2, 4, or 8: got `$-1)'
> /var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//ccREEh7B.s:502:Bad
> Absolute Expression, absolute 0 assumed.
> /var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//ccREEh7B.s:502:expecting
> scale factor of 1, 2, 4, or 8: got `$-1)'
> /var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//ccREEh7B.s:622:Bad
> Absolute Expression, absolute 0 assumed.
> /var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//ccREEh7B.s:622:expecting
> scale factor of 1, 2, 4, or 8: got `$-8)'
> /var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//ccREEh7B.s:712:Bad
> Absolute Expression, absolute 0 assumed.
> /var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//ccREEh7B.s:712:expecting
> scale factor of 1, 2, 4, or 8: got `$-1)'
>
> ??
>
> Apparently it understands I don't want to take a register for d_idx,
> but it doesn't understand how to make it a number. Yet. Help?
Whereas yasm just adds together everything inside a set of [] to get a
memory arg, gas has special syntax for each component. There is no inline
asm input constraint that would give you something that can be either a
register or a constant and that can be dereferenced to get a memory arg. I
don't know even how to get something that's only a constant and usable in
that context ("i" adds a $ which is required in immediate args but
forbidden in addresses).
The only way to do it is to use "m" and put the entire address in the
input constraint.
"movd %0, %%mm1 \n"
"por %1, %%mm1 \n"
::"m"(nnz[b_idx]),
"m"(nnz[b_idx+d_idx])
);
This isn't strictly correct, in that it doesn't tell gcc that you're
also reading from other adjacent elements of nnz[]. This is rarely a
problem in asm that isn't inlined (because gcc isn't clever enough to
track non-reference through function pointers), but is really annoying to
track down when it does happen. If you did need to fix that, the only ways
I know of are:
"movd %0, %%mm1 \n"
"por %1, %%mm1 \n"
::"m"(*(struct {uint8_t x[4];}*)&nnz[b_idx]),
"m"(*(struct {uint8_t x[4];}*)&nnz[b_idx+d_idx])
);
which is even uglier than usual, or
"movd %0, %%mm1 \n"
"por %1, %%mm1 \n"
::"m"(nnz[b_idx]),
"m"(nnz[b_idx+d_idx])
:"memory"
);
which is overbroad and thus invites pessimization.
--Loren Merritt
More information about the ffmpeg-devel
mailing list