[FFmpeg-devel] [PATCH] ARM: remove useless stack push/pop
Wed Jun 9 10:07:04 CEST 2010
Rafaël Carré <rafael.carre at gmail.com> writes:
> On Wed, 09 Jun 2010 00:43:54 +0100
> Måns Rullgård <mans at mansr.com> wrote:
>> Rafaël Carré <rafael.carre at gmail.com> writes:
>> > While I'm here, did anyone try to build FFmpeg with -mthumb yet ?
>> Yes, gcc generated invalid asm.
> Do you have more information about this ?
> It would be useful if the same thing happens to me and I know what to
> look for
It fails with some obscure error in libavformat/utils.c iirc. I think
it was a load or a branch with invalid offset.
>> > "grep -Er '(pop|ldm).*pc' libavcodec/arm" shows that there are a lot
>> > of functions which can't be called from Thumb on ARMv4T: using
>> > ldm ...,pc will not perform the switch from ARM to Thumb on these
>> > CPUs.
>> So use interworking if you need to. Any decent linker supports that.
> interworking modifies function calls by inserting veneers but returning
> from ASM code while setting the T bit correctly is your responsibility
> Also "The AAPCS requires that all sub-routine call and return sequences
> support inter-working between ARM and Thumb states." but it doesn't
> really make sense for internal functions, so unless you export assembly
> functions you're good
Yeah, you're right. I still don't think there's much point to it.
ARMv4 CPUs are too slow anyway.
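
To make the return-sequence point concrete, here is a rough C model of the
two ways of returning. It is only a sketch, not code from FFmpeg: the
cpu_state struct, both function names and the arch_version parameter are
made up for illustration. It assumes the documented behaviour that BX takes
the new state from bit 0 of the target on any Thumb-capable core, while a
load into pc via LDM/POP only does so from ARMv5T on and stays in ARM state
on ARMv4T.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the only CPU state that matters for a return. */
typedef struct {
    uint32_t pc;    /* address execution continues at */
    bool     thumb; /* T bit: true means Thumb state */
} cpu_state;

/* "bx lr": bit 0 of the target selects Thumb state, the rest is the address. */
cpu_state return_with_bx(uint32_t lr)
{
    cpu_state s;
    s.pc    = lr & ~UINT32_C(1);
    s.thumb = (lr & 1u) != 0;
    return s;
}

/* "ldm sp!, {..., pc}" or "pop {..., pc}" executed in ARM state.
 * arch_version is an illustrative parameter: 4 for ARMv4T, 5 for ARMv5T. */
cpu_state return_with_ldm_pc(uint32_t loaded, int arch_version)
{
    cpu_state s;
    assert(arch_version >= 4);
    if (arch_version >= 5) {
        /* ARMv5T and later: the load interworks just like BX. */
        s.pc    = loaded & ~UINT32_C(1);
        s.thumb = (loaded & 1u) != 0;
    } else {
        /* ARMv4T: the low bits are ignored and the core stays in ARM state. */
        s.pc    = loaded & ~UINT32_C(3);
        s.thumb = false;
    }
    return s;
}
```

With a Thumb return address such as 0x8001, return_with_bx() ends up in
Thumb state, whereas return_with_ldm_pc(0x8001, 4) continues at 0x8000 in
ARM state, which is the failure described in the quoted text.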
>> See also a blog post I did some time ago on the topic. Perhaps I
>> should revisit that.
> I found http://hardwarebug.org/2009/03/25/thumbs-up/ which looks
> promising, although it's about thumb-2, not thumb
>> BTW, many, if not most, Cortex-A8 chips in the field have hardware
>> bugs rendering any mixing of Thumb and ARM code unreliable. Older
>> cores work, but most of those are pre-Thumb2 and the speed penalty
>> there is too great for FFmpeg.
> I wouldn't know, I only have really old CPUs ;)
>> > diff --git a/libavcodec/arm/jrevdct_arm.S
>> Does this function call any other functions? If so, the stack must
>> maintain 8-byte alignment, and this is the easiest way to accomplish
>> that. Not that you'd want to use that DCT implementation anyway.
> No it doesn't (stack pointer is modified a few lines below entry anyway)
Does anything in the function require 8-byte alignment? LDRD or so.
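
The alignment arithmetic behind the extra register in a push can be sketched
in C. This is only an illustration: sp_after_push() and is_8byte_aligned()
are hypothetical helpers, and it assumes sp is 8-byte aligned on entry, as
the AAPCS requires at public interfaces. Each pushed register takes 4 bytes
on a descending stack, so an even register count keeps the alignment and an
odd count breaks it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Value of sp after "push {...}" storing nregs 32-bit registers
 * (full-descending stack: sp moves down by 4 bytes per register). */
uint32_t sp_after_push(uint32_t sp, unsigned nregs)
{
    assert(sp % 4 == 0); /* sp is always at least word aligned */
    return sp - 4u * nregs;
}

/* True if sp meets the 8-byte alignment needed for calls and LDRD/STRD. */
bool is_8byte_aligned(uint32_t sp)
{
    return (sp & 7u) == 0;
}
```

Starting from sp = 0x1000, pushing two or four registers leaves sp aligned
to 8 bytes, while pushing one or three does not. That is why a push with an
odd number of needed registers often includes one extra register as padding.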
mans at mansr.com