[FFmpeg-devel] [PATCH] Altivec split-radix FFT

Måns Rullgård mans at mansr.com
Thu Aug 27 00:35:57 CEST 2009


David Conrad <lessen42 at gmail.com> writes:

> On Aug 26, 2009, at 5:28 PM, Måns Rullgård wrote:
>
>> Guillaume POIRIER <poirierg at gmail.com> writes:
>>
>>> Hey,
>>
>>> The previous error I reported was on Linux/PPC64, where your code was
>>> compiling/assembling OK, but not running.
>>> I tried to fix that problem on my local OS X Leopard/PPC, and in fact
>>> your code doesn't assemble with that toolchain. That can't be good.
>>>
>>> This is the relevant part of your code (from ff_fft_calc_altivec()
>>> function):
>>>
>>> +        "mtctr %0 \n"
>>> +        "stw 2,-4(1) \n"
>>> +        "li 2,16 \n"
>>> +        "bctrl \n"
>>> +        "lwz 2,-4(1) \n" // ABI docs say r2 is general purpose and caller-saved, but gcc doesn't save it and crashes
>>
>> Which ABI doc was that?  Mine says r2 is "reserved for system use and
>> should not be changed by application code".  Be that as it may, I
>> assume you meant stwu to push the value of r2 and lwz/addi to restore
>> it.  Even with that change, it would fail on ppc64 since you'd be
>> preserving only half the register.  Furthermore, the ABI mandates a
>> 16-byte-aligned stack at all times, but you could probably get away
>> without that if you never call any compiled code.
>
> On OS X at least it's general purpose. http://developer.apple.com/documentation/DeveloperTools/Conceptual/LowLevelABI/100-32-bit_PowerPC_Function_Calling_Conventions/32bitPowerPC.html

I was reading the PPC ELF spec.  I guess they're different.
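
As a rough illustration of the width and alignment points quoted above,
here is a minimal C sketch.  The names aligned_push_bytes and gpr_bytes
are made up for this example and are not part of the patch.  It only
rounds a save area up to the 16-byte multiple the ABI asks for; it does
not model a complete stack frame (back chain, LR save slot and so on).

```c
#include <stddef.h>

/* Stack alignment required by the ABI, as stated in the thread. */
#define STACK_ALIGN 16

/* Hypothetical helper, not from the patch: how far the stack pointer
 * would have to move to hold save_bytes and stay 16-byte aligned. */
static size_t aligned_push_bytes(size_t save_bytes)
{
    return (save_bytes + STACK_ALIGN - 1) & ~(size_t)(STACK_ALIGN - 1);
}

/* A general-purpose register is 4 bytes on ppc32 and 8 bytes on ppc64,
 * so a 32-bit stw stores only half of r2 on ppc64. */
static size_t gpr_bytes(int is_ppc64)
{
    return is_ppc64 ? 8 : 4;
}
```

Under these assumptions, pushing a single register costs a full 16 bytes
of stack on both ppc32 and ppc64; only the store and load instructions
would need to differ between the two.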

>>> Which translates into:
>>> L214: mtctr r11
>>> L215: stw 2,-4(1)
>>> L216: li 2,16
>>> L217: bctrl
>>> L218: lwz 2,-4(1)
>>>
>>> The error message is:
>>> libavcodec/ppc/fft_altivec.S:215:Parameter syntax error (parameter 1)
>>> libavcodec/ppc/fft_altivec.S:216:Parameter syntax error (parameter 1)
>>> libavcodec/ppc/fft_altivec.S:218:Parameter syntax error (parameter 1)
>>>
>>> I honestly don't understand what's wrong with what you wrote. The OS X
>>> toolchain doesn't accept that code for either the PPC32 or the PPC64 target.
>>
>> That's probably the assembler throwing a fit over the reserved r2
>> register.  You could cheat it by writing the instructions with .word
>> directives instead.
>
> Apple's gas only accepts register names, not numbers for asm. So it
> would be stw r2, -4(r1) etc.

And GNU gas only supports numbers.  For extra fun, objdump uses r
names.
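
One way to cope with the two spellings, sketched here in C, is to let
the preprocessor build the register operands.  The macro names REG and
ASM_USES_R_NAMES are assumptions invented for this example, not
something the patch or either assembler defines.  The instruction
sequence is copied from the quoted patch purely to show the spelling
and does not address the stack and r2 concerns raised earlier.

```c
#include <stdio.h>

/* Hypothetical switch: define ASM_USES_R_NAMES when the assembler wants
 * "r2" style operands (as reported for Apple's gas in this thread). */
#ifdef ASM_USES_R_NAMES
#define REG(n) "r" #n   /* REG(2) expands to "r" "2", i.e. "r2" */
#else
#define REG(n) #n       /* REG(2) expands to "2" */
#endif

/* The sequence from the quoted patch, minus the mtctr with its %0
 * operand, spelled through REG(). */
static const char r2_save_sequence[] =
    "stw " REG(2) ",-4(" REG(1) ")\n"
    "li " REG(2) ",16\n"
    "bctrl\n"
    "lwz " REG(2) ",-4(" REG(1) ")\n";

/* Print the text that would be handed to the assembler. */
static void print_r2_save_sequence(void)
{
    fputs(r2_save_sequence, stdout);
}
```

Built without ASM_USES_R_NAMES this yields the numeric form from the
patch; with it defined, the same source yields stw r2,-4(r1) and so on.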

-- 
Måns Rullgård
mans at mansr.com