[FFmpeg-devel] [PATCH] Fix for DECLARE_ALIGNED_8 macro on ARM (also prevent gcc 3.4.x ICE)
Sat Feb 16 15:39:19 CET 2008
M?ns Rullg?rd <mans at mansr.com> writes:
> Siarhei Siamashka <siarhei.siamashka at gmail.com> writes:
>> Hello All,
>> Currently DECLARE_ALIGNED_8 macro is quite misleading on ARM (it tries to
>> enforce 4 byte alignment contarary to what a developer reading the
>> ffmpeg sources could naturally expect):
>> /* This is to use 4 bytes read to the IDCT pointers for some 'zero'
>> line optimizations */
>> #define DECLARE_ALIGNED_8(t, v) DECLARE_ALIGNED(4, t, v)
>> #define STRIDE_ALIGN 4
>> And it really causes problems with gcc 3.4.x:
>> The offending line is the following:
>> DECLARE_ALIGNED_8 (uint64_t, aligned_bak[stride]);
>> Apparently gcc got confused when asked to align uint64_t array at 4 byte
>> boundary and died with ICE. Earlier I just thought 'WTF?' and removed
>> DECLARE_ALIGNED_8 around that array (gcc should be clever enough to align
>> that array properly without any help anyway). Now it is clear that
>> DECLARE_ALIGNED_8 macro definition triggered this problem.
>> Modern ARM cores have instructions for reading/writing 64-bit data
>> and require strict 8 byte alignment for using them. According to ARM
>> manual, using LDRD instruction with the data that is not aligned at
>> 8 byte boundary is UNPREDICTABLE and IMPLEMENTATION
>> DEFINED. Unfortunately, both ARM cores that I have available (ARM9E
>> and ARM11) require just 4 byte alignment for these instructions
>> (ARM11 is faster when used with 8 byte aligned data though). So if
>> there were any other problems related to the use of this macro, I
>> could not observe them in my tests.
>> Anyway, I think it is better to fix this macro even though the
>> immediate visible effect is only the "out of the box" support for
>> legacy compiler (gcc 3.4.x).
> I agree. Since some ARM variants might require strict alignment for
> 64-bit loads, we must provide this.
>> The comment in the code was removed by patch as I don't see direct
>> relation of this macro and IDCT, 16 byte alignment is supposedly
>> guaranteed for IDCT according to the comments from dsputil.h, anyway
>> feel free to prove me wrong:
>> void (*idct)(DCTELEM *block/* align 16*/);
>> Index: libavcodec/dsputil.h
>> --- libavcodec/dsputil.h (revision 12120)
>> +++ libavcodec/dsputil.h (working copy)
>> @@ -535,10 +535,8 @@
>> #elif defined(ARCH_ARMV4L)
>> -/* This is to use 4 bytes read to the IDCT pointers for some 'zero'
>> - line optimizations */
>> -#define DECLARE_ALIGNED_8(t, v) DECLARE_ALIGNED(4, t, v)
>> -#define STRIDE_ALIGN 4
>> +#define DECLARE_ALIGNED_8(t, v) DECLARE_ALIGNED(8, t, v)
>> +#define STRIDE_ALIGN 8
> Looks good to me.
Scratch that; that section of the file is a mess. I propose the
attached patch to clean it up instead.
mans at mansr.com
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 1802 bytes
Desc: not available
More information about the ffmpeg-devel