[FFmpeg-devel] [PATCH] Altivec split-radix FFT
Thu Aug 27 00:20:22 CEST 2009
On Aug 26, 2009, at 5:28 PM, Måns Rullgård wrote:
> Guillaume POIRIER <poirierg at gmail.com> writes:
>> The previous error I reported was on Linux/PPC64, where your code was
>> compiling/assembling OK, but not running.
>> I tried to fix that problem on my local OS X Leopard/PPC, and in fact, the
>> code doesn't assemble with that toolchain. That can't be good.
>> This is the relevant part of your code (from ff_fft_calc_altivec()):
>> + "mtctr %0 \n"
>> + "stw 2,-4(1) \n"
>> + "li 2,16 \n"
>> + "bctrl \n"
>> + "lwz 2,-4(1) \n" // ABI docs say r2 is general purpose and
>> caller-saved, but gcc doesn't save it and crashes
> Which ABI doc was that? Mine says r2 is "reserved for system use and
> should not be changed by application code". Be that as it may, I
> assume you meant stwu to push the value of r2 and lwz/addi to restore
> it. Even with that change, it would fail on ppc64 since you'd be
> preserving only half the register. Furthermore, the ABI mandates a
> 16-byte-aligned stack at all times, but you could probably get away
> without that if you never call any compiled code.
On OS X at least it's general purpose. http://developer.apple.com/documentation/DeveloperTools/Conceptual/LowLevelABI/100-32-bit_PowerPC_Function_Calling_Conventions/32bitPowerPC.html
>> Which translates into:
>> L214: mtctr r11
>> L215: stw 2,-4(1)
>> L216: li 2,16
>> L217: bctrl
>> L218: lwz 2,-4(1)
>> The error message is:
>> libavcodec/ppc/fft_altivec.S:215:Parameter syntax error (parameter 1)
>> libavcodec/ppc/fft_altivec.S:216:Parameter syntax error (parameter 1)
>> libavcodec/ppc/fft_altivec.S:218:Parameter syntax error (parameter 1)
>> I honestly don't understand what's wrong with what you wrote. The OS X
>> toolchain doesn't accept that code for either the PPC32 or the PPC64 target.
> That's probably the assembler throwing a fit over the reserved r2
> register. You could cheat it by writing the instructions with .word
> directives instead.
Apple's gas only accepts register names, not bare register numbers, in
operands. So it would be stw r2, -4(r1) etc.
Apple's gas also doesn't support the macro syntax used in the .S,
doesn't support .ifb, and uses .const_data instead of .rodata.
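
As a rough illustration of the register-name point, here is a minimal Python sketch that rewrites the numeric operands quoted in this thread into the r-prefixed form. The helper name name_registers and its operand table are hypothetical, invented for this example; the table covers only the mnemonics that appear above (mtctr, li, stw, lwz) and is not a general PowerPC parser or anything FFmpeg itself uses.

```python
import re

# Assumption: for each mnemonic quoted in the thread, the set of operand
# positions that hold a register. Anything not listed is left untouched.
REG_OPERANDS = {
    "mtctr": {0},
    "li": {0},
    "stw": {0},
    "lwz": {0},
}

# Displacement(base) memory operand with a numeric base register, e.g. -4(1).
MEM_RE = re.compile(r"^(-?\d+)\((\d+)\)$")


def name_registers(line):
    """Rewrite bare register numbers as r<N> for the known mnemonics."""
    parts = line.strip().split(None, 1)
    if len(parts) < 2 or parts[0] not in REG_OPERANDS:
        # No operands, or a mnemonic this sketch does not know about.
        return line.strip()
    mnemonic = parts[0]
    rewritten = []
    for index, operand in enumerate(op.strip() for op in parts[1].split(",")):
        mem = MEM_RE.match(operand)
        if mem:
            # Keep the displacement, name the base register.
            rewritten.append(f"{mem.group(1)}(r{mem.group(2)})")
        elif index in REG_OPERANDS[mnemonic] and operand.isdigit():
            rewritten.append(f"r{operand}")
        else:
            # Immediates and already-named registers pass through unchanged.
            rewritten.append(operand)
    return f"{mnemonic} " + ",".join(rewritten)


if __name__ == "__main__":
    for asm in ["mtctr r11", "stw 2,-4(1)", "li 2,16", "bctrl", "lwz 2,-4(1)"]:
        print(name_registers(asm))
```

The immediate in li 2,16 is left alone because only operand position 0 is listed as a register for li. The macro syntax, .ifb, and .rodata differences are not handled by this sketch.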