[FFmpeg-devel] [PATCH] Altivec split-radix FFT
Wed Aug 26 23:28:35 CEST 2009
Guillaume POIRIER <poirierg at gmail.com> writes:
> On Wed, Aug 26, 2009 at 12:07 AM, Guillaume POIRIER<poirierg at gmail.com> wrote:
> There's really little information about inline asm PPC64 programming.
I've found gcc source code to be the best documentation for inline asm.
> So far, it looks like PPC64 is exactly the same thing as PPC32, except
> that it allows 64-bit pointers.
Registers are 64-bit instead of 32-bit which makes a hell of a
difference if you use the full width.
> The previous error I reported was on Linux/PPC64, where your code was
> compiling/assembling OK, but not running.
> I tried to fix that problem on my local OS X Leopard/PPC, and in fact, your
> code doesn't assemble with that toolchain. That can't be good.
> This is the relevant part of your code (from ff_fft_calc_altivec() function):
> + "mtctr %0 \n"
> + "stw 2,-4(1) \n"
> + "li 2,16 \n"
> + "bctrl \n"
> + "lwz 2,-4(1) \n" // ABI docs say r2 is general purpose and
> + // caller-saved, but gcc doesn't save it and crashes
Which ABI doc was that? Mine says r2 is "reserved for system use and
should not be changed by application code". Be that as it may, I
assume you meant stwu to push the value of r2 and lwz/addi to restore
it. Even with that change, it would fail on ppc64 since you'd be
preserving only half the register. Furthermore, the ABI mandates a
16-byte-aligned stack at all times, but you could probably get away
without that if you never call any compiled code.
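To make the ppc64 point concrete, here is a minimal sketch in plain C that
only models the register contents; it is not PowerPC code. The function
names are invented for illustration. It assumes the usual semantics: stw
stores the low 32 bits of the register, lwz zero-extends on reload, and
std/ld move the full 64 bits.

```c
#include <stdint.h>

/* Model of saving r2 with stw and restoring it with lwz on ppc64:
 * stw stores only the low 32 bits, and lwz zero-extends on reload. */
uint64_t roundtrip_stw_lwz(uint64_t r2)
{
    uint32_t slot = (uint32_t)r2;   /* stw: upper 32 bits are dropped */
    return (uint64_t)slot;          /* lwz: upper 32 bits come back as 0 */
}

/* Model of the 64-bit pair std/ld, which keeps the whole register. */
uint64_t roundtrip_std_ld(uint64_t r2)
{
    uint64_t slot = r2;             /* std: all 64 bits are stored */
    return slot;                    /* ld: all 64 bits are reloaded */
}

/* The stack rule mentioned above: r1 must be a multiple of 16. */
int stack_is_aligned(uint64_t sp)
{
    return (sp & 15) == 0;
}
```

Any r2 value with nonzero upper bits comes back changed from the first
function and unchanged from the second, which is why the 32-bit
save/restore cannot be correct on ppc64.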
> Which translates into:
> L214: mtctr r11
> L215: stw 2,-4(1)
> L216: li 2,16
> L217: bctrl
> L218: lwz 2,-4(1)
> The error message is:
> libavcodec/ppc/fft_altivec.S:215:Parameter syntax error (parameter 1)
> libavcodec/ppc/fft_altivec.S:216:Parameter syntax error (parameter 1)
> libavcodec/ppc/fft_altivec.S:218:Parameter syntax error (parameter 1)
> I honestly don't understand what's wrong with what you wrote. The OSX
> toolchain doesn't like that code for either the PPC32 or the PPC64 target.
That's probably the assembler throwing a fit over the reserved r2
register. You could cheat it by writing the instructions with .word
directives.
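As a rough sketch of what hand-encoding would involve, the C helpers below
compute the 32-bit words for the three instructions that the OSX assembler
rejects. They assume the standard PowerPC D-form layout (6-bit primary
opcode, two 5-bit register fields, 16-bit signed immediate) with primary
opcodes 36 for stw, 32 for lwz and 14 for addi, and that li is the usual
shorthand for addi with RA=0. The helper names are made up here and are not
part of FFmpeg.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a D-form instruction: opcode | RT/RS | RA | 16-bit immediate. */
uint32_t ppc_dform(unsigned opcode, unsigned rt, unsigned ra, int16_t d)
{
    assert(opcode < 64 && rt < 32 && ra < 32);
    return ((uint32_t)opcode << 26) | ((uint32_t)rt << 21) |
           ((uint32_t)ra << 16) | (uint16_t)d;
}

/* stw rs,d(ra): primary opcode 36 */
uint32_t encode_stw(unsigned rs, int16_t d, unsigned ra)
{
    return ppc_dform(36, rs, ra, d);
}

/* lwz rt,d(ra): primary opcode 32 */
uint32_t encode_lwz(unsigned rt, int16_t d, unsigned ra)
{
    return ppc_dform(32, rt, ra, d);
}

/* li rt,imm is addi rt,0,imm: primary opcode 14 with RA = 0 */
uint32_t encode_li(unsigned rt, int16_t imm)
{
    return ppc_dform(14, rt, 0, imm);
}
```

With these, stw 2,-4(1) comes out as 0x9041fffc, li 2,16 as 0x38400010 and
lwz 2,-4(1) as 0x8041fffc. Check how wide .word is in the assembler you
target before pasting such values in; some assemblers need .long for a
32-bit value.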
mans at mansr.com