Ticket #272 (closed defect: fixed)
altivec specific code segfaults on ppc when compiled with --enable-pic
| Reported by: | kim | Owned by: | |
|---|---|---|---|
| Priority: | normal | Component: | avcodec |
| Version: | git-master | Keywords: | ppc, PIC |
| Cc: | Blocked By: | ||
| Blocking: | Reproduced by developer: | yes | |
| Analyzed by developer: | no |
Description
When compiled with --enable-pic, ffplay segfault when using code in libavcodec/ppc/fft_altivec_s.S on various files. For instance when decoding aac or ac3 audio streams. This is always reproducible and do not happen when disabling pic (the default):
Tested with latest git (47d2ca3205b53665328fe301879c339449db7a1d) but this has been the case for a long time I believe, I started remarking this when upgrading from mythtv 0.23 to 0.24, several months ago.
from master:
./configure --enable-pic
make
gdb ./ffplay_g ~myth/aac.dump
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x493b5460 (LWP 32531)]
0x10361e3c in ff_fft_calc_altivec () at libavcodec/ppc/fft_altivec_s.S:447
447 DECL_FFTS 0
Full gdb output attached (with bt, disass and register dump) attached.
I believe the cause of the crash to be the loading of a global symbol in a non pic compatible way.
Just doing ./configure and make, the file plays normally.
Attachments
Change History
comment:1 Changed 2 years ago by cehoyos
- Status changed from new to open
- Version changed from unspecified to git-master
If you believe the crash did not happen in the past, please find the version introducing it.
comment:2 Changed 2 years ago by kim
Hi again,
the offending commit is this one (hurray for git bissect):
commit a46b84d1204d3cd2de14f08de29afee08f8f86d0
Author: Måns Rullgård <mans@…>
Date: Sun Jul 4 18:33:47 2010 +0000
PPC: convert Altivec FFT to pure assembler
On PPC a leaf function has a 288-byte red zone below the stack pointer,
sparing these functions the chore of setting up a full stack frame.
When a function call is disguised within an inline asm block, the
compiler might not adjust the stack pointer as required before a
function call, resulting in the red zone being clobbered.
Moving the entire function to pure asm avoids this problem and also
results in somewhat better code.
Originally committed as revision 24044 to svn://svn.ffmpeg.org/ffmpeg/trunk
Reverting this commit on top of master (which is f9ecb849ef39bc337d9439b829fe08da5c95cc3d at the moment) fixes the issue for me. I guess it went unnoticed for so long because
1) ppc is pretty rare these days
2) other software using ffmpeg as a shared library (thus using -fPIC) used to embbed it and link statically until fairly recently, or just shipped older versions.
(From the commit message though, reverting might introduce some other bugs that the commit fixed).
comment:3 Changed 2 years ago by cehoyos
I compiled current FFmpeg on a G4 with --enable-pic (made sure about #define HAVE_ALTIVEC 1), tested two files containing aac audio, made sure ff_fft_calc_altivec() was called, but cannot reproduce a crash here.
(Your output is incomplete, unfortunately.)
comment:4 Changed 2 years ago by kim
I have confirmed this with a stream from my internet/tv box. The sample is available here:
http://idea.nguyen.vg/~kim/aac2.dump
(few seconds of a french commercial). Video is H264, Audio is AAC. When compiled with --enable-pic
with commit a46b84d1204 reverted, ffplay plays audio and video fine. With plain master
(f9ecb849ef3) it segfaults as explained above. The stream also plays fine on x86 vlc (just to check that it's not too corrupt).
Here are more infos and I'm attaching a gdb trace with full disassembly of the function (instead of pc-32, pc+32).
Machine is a G4 MacMini? :
/proc/cpuinfo:
processor : 0
cpu : 7447A, altivec supported
clock : 1416.666661MHz
revision : 1.2 (pvr 8003 0102)
bogomips : 83.24
timebase : 41620997
platform : PowerMac?
model : PowerMac10,1
machine : PowerMac10,1
motherboard : PowerMac10,1 MacRISC3 Power Macintosh
detected as : 287 (Mac mini)
pmac flags : 00000010
L2 cache : 512K unified
pmac-generation : NewWorld?
Memory : 1024 MB
gcc version:
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/powerpc-linux-gnu/gcc/powerpc-linux-gnu/4.5.2/lto-wrapper
Target: powerpc-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro? 4.5.2-8ubuntu4' --with-bugurl= file:///usr/share/doc/gcc-4.5/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.5 --enable-shared --enable-multiarch --with-multiarch-defaults=powerpc-linux-gnu --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib/powerpc-linux-gnu --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.5 --libdir=/usr/lib/powerpc-linux-gnu --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-plugin --enable-gold --enable-ld=default --with-plugin-ld=ld.gold --enable-objc-gc --enable-secureplt --disable-softfloat --enable-targets=powerpc-linux,powerpc64-linux --with-cpu=default32 --disable-werror --with-long-double-128 --enable-checking=release --build=powerpc-linux-gnu --host=powerpc-linux-gnu --target=powerpc-linux-gnu
Thread model: posix
gcc version 4.5.2 (Ubuntu/Linaro? 4.5.2-8ubuntu4)
as version:
GNU assembler version 2.21.0 (powerpc-linux-gnu) using BFD version (GNU Binutils for Ubuntu) 2.21.0.20110327
distrib is ubuntu natty, ppc (libc6 version is 2.13 egcs)
Changed 2 years ago by kim
-
attachment
full_gdb.txt
added
GDB trace for the file aac2.dump with full disassembly
comment:5 Changed 2 years ago by cehoyos
Since I am unable to reproduce the crash with the sample on a G4 after compiling with --enable-pic (and with altivec enabled):
Could you please provide complete, uncut output of ffmpeg when crashing?
(If ffmpeg does not crash, but ffplay does, please mention it and provide complete, uncut output of ffplay.)
$ ffmpeg -i aac2.dump -f null -
ffmpeg version git-N-30662-gf9ecb84, Copyright (c) 2000-2011 the FFmpeg developers
built on Jun 9 2011 09:58:10 with gcc 4.0.1 (Apple Inc. build 5493)
configuration: --enable-pic
libavutil 51. 8. 0 / 51. 8. 0
libavcodec 53. 6. 1 / 53. 6. 1
libavformat 53. 2. 0 / 53. 2. 0
libavdevice 53. 1. 1 / 53. 1. 1
libavfilter 2. 14. 1 / 2. 14. 1
libswscale 0. 14. 1 / 0. 14. 1
[h264 @ 0x1010800] non-existing PPS referenced
[h264 @ 0x1010800] non-existing PPS 0 referenced
[h264 @ 0x1010800] decode_slice_header error
[h264 @ 0x1010800] no frame!
[h264 @ 0x1010800] non-existing PPS referenced
[h264 @ 0x1010800] non-existing PPS 0 referenced
[h264 @ 0x1010800] decode_slice_header error
[h264 @ 0x1010800] no frame!
[h264 @ 0x1010800] non-existing PPS referenced
[h264 @ 0x1010800] non-existing PPS 0 referenced
[h264 @ 0x1010800] decode_slice_header error
[h264 @ 0x1010800] no frame!
[h264 @ 0x1010800] non-existing PPS referenced
[h264 @ 0x1010800] non-existing PPS 0 referenced
[h264 @ 0x1010800] decode_slice_header error
[h264 @ 0x1010800] no frame!
[h264 @ 0x1010800] non-existing PPS referenced
[h264 @ 0x1010800] non-existing PPS 0 referenced
[h264 @ 0x1010800] decode_slice_header error
[h264 @ 0x1010800] no frame!
[h264 @ 0x1010800] non-existing PPS referenced
[h264 @ 0x1010800] non-existing PPS 0 referenced
[h264 @ 0x1010800] decode_slice_header error
[h264 @ 0x1010800] no frame!
[h264 @ 0x1010800] non-existing PPS referenced
[h264 @ 0x1010800] non-existing PPS 0 referenced
[h264 @ 0x1010800] decode_slice_header error
[h264 @ 0x1010800] no frame!
[h264 @ 0x1010800] non-existing PPS referenced
[h264 @ 0x1010800] non-existing PPS 0 referenced
[h264 @ 0x1010800] decode_slice_header error
[h264 @ 0x1010800] no frame!
[h264 @ 0x1010800] non-existing PPS referenced
[h264 @ 0x1010800] non-existing PPS 0 referenced
[h264 @ 0x1010800] decode_slice_header error
[h264 @ 0x1010800] no frame!
[h264 @ 0x1010800] non-existing PPS referenced
[h264 @ 0x1010800] non-existing PPS 0 referenced
[h264 @ 0x1010800] decode_slice_header error
[h264 @ 0x1010800] no frame!
[aac @ 0x1012400] channel element 2.10 is not allocated
[h264 @ 0x1010800] non-existing PPS referenced
[h264 @ 0x1010800] non-existing PPS 0 referenced
[h264 @ 0x1010800] decode_slice_header error
[h264 @ 0x1010800] no frame!
[h264 @ 0x1010800] non-existing PPS referenced
[h264 @ 0x1010800] non-existing PPS 0 referenced
[h264 @ 0x1010800] decode_slice_header error
[h264 @ 0x1010800] no frame!
[h264 @ 0x1010800] non-existing PPS referenced
[h264 @ 0x1010800] non-existing PPS 0 referenced
[h264 @ 0x1010800] decode_slice_header error
[h264 @ 0x1010800] no frame!
[h264 @ 0x1010800] non-existing PPS referenced
[h264 @ 0x1010800] non-existing PPS 0 referenced
[h264 @ 0x1010800] decode_slice_header error
[h264 @ 0x1010800] no frame!
[h264 @ 0x1010800] non-existing PPS referenced
[h264 @ 0x1010800] non-existing PPS 0 referenced
[h264 @ 0x1010800] decode_slice_header error
[h264 @ 0x1010800] no frame!
[h264 @ 0x1010800] non-existing PPS referenced
[h264 @ 0x1010800] non-existing PPS 0 referenced
[h264 @ 0x1010800] decode_slice_header error
[h264 @ 0x1010800] no frame!
[h264 @ 0x1010800] non-existing PPS referenced
[h264 @ 0x1010800] non-existing PPS 0 referenced
[h264 @ 0x1010800] decode_slice_header error
[h264 @ 0x1010800] no frame!
[h264 @ 0x1010800] non-existing PPS referenced
[h264 @ 0x1010800] non-existing PPS 0 referenced
[h264 @ 0x1010800] decode_slice_header error
[h264 @ 0x1010800] no frame!
[h264 @ 0x1010800] non-existing PPS referenced
[h264 @ 0x1010800] non-existing PPS 0 referenced
[h264 @ 0x1010800] decode_slice_header error
[h264 @ 0x1010800] no frame!
[mpegts @ 0x1005400] max_analyze_duration 5000000 reached at 5034667
Input #0, mpegts, from 'aac2.dump':
Duration: 00:00:10.24, start: 3033.232422, bitrate: 1791 kb/s
Program 1352
Stream #0.0[0x23](fra): Subtitle: [6][0][0][0] / 0x0006
Stream #0.1[0xa3]: Video: h264 (High), yuv420p, 480x576 [PAR 32:15 DAR 16:9], 29.06 fps, 25 tbr, 90k tbn, 50 tbc
Stream #0.2[0x778](fra): Audio: aac, 48000 Hz, stereo, s16, 73 kb/s
[buffer @ 0xc0e110] w:480 h:576 pixfmt:yuv420p tb:1/1000000 sar:32/15 sws_param:
Output #0, null, to 'pipe:':
Metadata:
encoder : Lavf53.2.0
Stream #0.0: Video: rawvideo, yuv420p, 480x576 [PAR 32:15 DAR 16:9], q=2-31, 200 kb/s, 90k tbn, 25 tbc
Stream #0.1(fra): Audio: pcm_s16be, 48000 Hz, stereo, s16, 1536 kb/s
Stream mapping:
Stream #0.1 -> #0.0
Stream #0.2 -> #0.1
Press [q] to stop, [?] for help
[aac @ 0x1012400] channel element 2.10 is not allocated
Error while decoding stream #0.2
frame= 85 fps= 39 q=0.0 Lsize= -0kB time=00:00:02.51 bitrate= -0.1kbits/s dup=13 drop=0
video:0kB audio:472kB global headers:0kB muxing overhead -100.004552%
comment:6 Changed 2 years ago by cehoyos
- Reproduced by developer set
Reproducible on Linux (not on OS X where gas-preprocessor.pl is used).
comment:7 follow-up: ↓ 8 Changed 2 years ago by reimar
Are you sure this isn't specific to your compiler version?
LLVM and possible also gcc made an ABI change that optimized the TOC register r2 away.
The current code thus can't work anymore.
comment:8 in reply to: ↑ 7 Changed 2 years ago by kim
Replying to reimar:
Are you sure this isn't specific to your compiler version?
LLVM and possible also gcc made an ABI change that optimized the TOC register r2 away.
The current code thus can't work anymore.
I think that's what is happening. However I'm using a fairly standard compiler (which I think is the default in any recent distro: gcc 4.5.x) Also tested with gcc 4.4.5 with the same results so I guess this change in gcc is quite old. However I came up with a proof of concept patch to load the address of local symbols in a pic friendly way. (I use the standard trick of computing the space between a fixed label and the symbol to be loaded and then add this to the adresss of the fixed label which is determined at runtime using a phony branch instruction). I've tested it on linux ppc 32bit and I'm now able to play the above sample without problems.
diff --git a/libavcodec/ppc/asm.S b/libavcodec/ppc/asm.S
index e372d53..30e0953 100644
--- a/libavcodec/ppc/asm.S
+++ b/libavcodec/ppc/asm.S
@@ -67,7 +67,11 @@ X(\name):
.macro movrel rd, sym
#if CONFIG_PIC
- lwz \rd, \sym@got(r2)
+ bcl 20, 31, lab_pic_\@
+lab_pic_\@:
+ mflr \rd
+ addis \rd, \rd, (\sym - lab_pic_\@)@ha
+ addi \rd, \rd, (\sym - lab_pic_\@)@l
#else
lis \rd, \sym@ha
la \rd, \sym@l(\rd)
comment:9 Changed 2 years ago by cehoyos
Apple gcc-4.2 with --enable-pic fails compilation:
$ make AS libavcodec/ppc/fft_altivec_s.o libavcodec/ppc/fft_altivec_s.S:747:Parameter syntax error (parameter 2) libavcodec/ppc/fft_altivec_s.S:747:Invalid mnemonic 'got(r2)' libavcodec/ppc/fft_altivec_s.S:787:Parameter syntax error (parameter 2) libavcodec/ppc/fft_altivec_s.S:787:Invalid mnemonic 'got(r2)' libavcodec/ppc/fft_altivec_s.S:794:Parameter syntax error (parameter 2) libavcodec/ppc/fft_altivec_s.S:794:Invalid mnemonic 'got(r2)' libavcodec/ppc/fft_altivec_s.S:1282:Parameter syntax error (parameter 2) libavcodec/ppc/fft_altivec_s.S:1282:Invalid mnemonic 'got(r2)' libavcodec/ppc/fft_altivec_s.S:1322:Parameter syntax error (parameter 2) libavcodec/ppc/fft_altivec_s.S:1322:Invalid mnemonic 'got(r2)' libavcodec/ppc/fft_altivec_s.S:1329:Parameter syntax error (parameter 2) libavcodec/ppc/fft_altivec_s.S:1329:Invalid mnemonic 'got(r2)' make: *** [libavcodec/ppc/fft_altivec_s.o] Error 1
comment:10 Changed 2 years ago by kim
This is strange, my patch is removing the @got(r2), not adding it. I don't know why it remains
when compiled on osx. Maybe it's due to gas-preprocessor.pl or something.
comment:11 Changed 2 years ago by cehoyos
(Sorry, I had originally written my comment before you posted your patch.)
Apple gcc 4.2 fails with current, unpatched FFmpeg git with --enable-pic
comment:12 Changed 2 years ago by cehoyos
With the patch, compilation succeeds with Apple gcc 4.2 and decoding also works on Linux.
comment:13 Changed 2 years ago by michael
what performance effect does this patch have?
it could slow things down for cases that dont need the extra code
comment:14 follow-up: ↓ 15 Changed 2 years ago by kim
what performance effect does this patch have?
it could slow things down for cases that dont need the extra code
Depends on what you mean by 'cases that don't need the extra code'. If you compile without PIC, the patch is not used (#ifdef) so no performance loss here, and if you compile with -fPIC and a 'recent' gcc nothing works anyway (gcc 4.2 which triggers the bug was released in 2007). I can also confirm that ASM + patch remains much faster than using the plain C version of the function.
By the way, gcc is using the same trick to load global symbol in -fPIC code nowadays:
static int GLOBAL[3] = {1, 2, 3};
int f(int x) { return GLOBAL[x]; }
Compiling with gcc -O0 -S -c foo.c one can see that the address of GLOBAL is computed as so:
bcl 20,31,.L2
.L2:
mflr 30
addis 30,30,.LCTOC1-.L2@ha
addi 30,30,.LCTOC1-.L2@l
where .LCTOC1 is the label marking the begining of the TOC section. This is exactly what the patch does.
Anyway the movrel macro at issue is always followed by *several* memory accesses so one extra unconditional jump can hardly impact performances. I also belive this patch to be backward compatible so the code can still be compiled with older versions of gcc/as (this is untested though).
What the patch does not handle at the moment is the PPC64 architecture (I'm not even sure this arch is affected by this bug).
comment:15 in reply to: ↑ 14 Changed 2 years ago by michael
- Status changed from open to closed
- Resolution set to fixed
Patch applied
i assume this fixes this ticket
Thanks for the patch



bt, disass, info all-registers of ffplay_g