[FFmpeg-devel] Why 'You can only build one library type at once on MinGW'?

Trent Piepho xyzzy
Fri May 11 23:55:52 CEST 2007

On Fri, 11 May 2007, Michael Niedermayer wrote:
> On Fri, May 11, 2007 at 01:35:57PM -0700, Trent Piepho wrote:
> > The situation on x86-64 is that a single library will fit into 4GB.  In fact,
> > gcc doesn't support creating objects larger than 4GB.  It is possible to use
> > 32-bit displacements with SIB addressing in non-pic, non-relocatable objects.
> > In pic objects, it's possible to use 32-bit displacements plus rip relative
> > addressing to compute a position independent address without thunking.
> >
> > All objects in total may be greater than 4GB,
> but they are not today and when we reach that point we can still switch to PIC
> or use 64bit immediate move to register style

Maybe not for you, but a lot of people have more than 4GB.

> > or may be loaded at an address
> > above 4GB even if the total is less.  32-bit displacements still work.
> why not just load it only in even bytes and redesign the toolchain to support
> that?

There is an advantage to using 64-bit addresses.

> > > > so text relocations are possible.  Even on ia32, where no instruction bloating
> > > > is necessary, PIC is almost always better than text relocations.
> > >
> > > its as much better as 10% fewer registers and 2x slower memory accesses due to
> > > the double indirection over the GOT
> >
> > And not dirtying every single page of the library doing relocations.
> the design where you load a library at a different address for every process
> is something very sick, hell i dont even know if linux with the current
> loader still behaves like that, its simply totally stupid, theres no need
> for that at all just give the lib a single systemwide address when its loaded
> the first time and you never have to dirty any page ...

So you want to limit someone to 4GB total address space across all
processes?  Have you even considered the memory fragmentation issues as
objects are loaded and unloaded?

> > I also suspect you don't know how PIC works on ia32.  There is no double
> > indirection.  Once the pic register is loaded, it can be used through an
> > entire function, it doesn't need to be re-loaded for each access.
> >
> > C code:
> > static int last;
> > static __attribute__((noinline)) int next(void) { return ++last; }
> >
> > non-PIC asm for next:
> >   mov    0x2c,%eax
> >   inc    %eax
> >   mov    %eax,0x2c
> >   ret
> >
> > PIC asm for next():
> >   call   25 <next+0x5>		;\
> >   pop    %ecx			;| = load ecx as pic register
> >   add    $0x3,%ecx		;/  (only done once per function)
> >
> >   mov    0x4c(%ecx),%eax	; code for ++last
> >   inc    %eax
> >   mov    %eax,0x4c(%ecx)
> >   ret
> PIC asm without static cheating:
> next:
>         call    __i686.get_pc_thunk.cx
>         addl    $_GLOBAL_OFFSET_TABLE_, %ecx
>         pushl   %ebp
>         movl    %esp, %ebp
>         movl    last@GOT(%ecx), %edx
>         movl    (%edx), %eax
>         incl    %eax
>         movl    %eax, (%edx)
>         popl    %ebp
>         ret
> i hope even someone who doesnt understand how PIC works can see that it does
> take the pointer "last" from the GOT and then dereferences it which is
> double indirection
> this happens because some wise text x86-ABI or ELF or whatever _requires_ it

You are confusing PIC with the way ELF DSOs work.  The reason this happens
for non-static variables is that they could come from another library.  One
could LD_PRELOAD a library that defined a global variable called 'last', and
then the code in the function next() would use that variable, not the one
from the same object as next().

Since it isn't known until run time which 'last' next() is going to
increment, the address is stored in the GOT.

This has nothing to do with PIC vs non-PIC, it's just the way ELF defines
symbol lookup.
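
To make the distinction concrete, here is a minimal sketch of a DSO source
file with three counters that differ only in linkage and visibility.  The
file name, the variable and function names, and the build command in the
comment are made up for illustration, and the per-symbol visibility
attribute is a GCC extension for ELF targets.  What each function compiles
to depends on the compiler version and options, so treat the comments as
what one would typically expect on ia32, not as guaranteed output.

```c
/* counters.c: hypothetical example, not code from this thread.
 * One way to build it as a DSO (assumed command line):
 *   gcc -O2 -fPIC -shared counters.c -o libcounters.so
 * Then compare the three functions with objdump -d. */

/* File-local: the definition is known to be in this object, so PIC code
 * can typically address it as an offset from the PIC register. */
static int last_static;

/* Default visibility: another object loaded earlier (e.g. via LD_PRELOAD)
 * may define the same name, so PIC code typically fetches the address
 * from the GOT and then dereferences it. */
int last_default;

/* Hidden visibility: usable from other files of the same DSO, but not
 * exported, so it cannot be replaced by a definition from outside. */
__attribute__((visibility("hidden"))) int last_hidden;

int next_static(void)  { return ++last_static; }
int next_default(void) { return ++last_default; }
int next_hidden(void)  { return ++last_hidden; }
```

All three functions behave the same from C; only the way the variable's
address is found differs, which is the point about symbol lookup above.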

You could compile your DSO with the flag "-fvisibility=hidden", and then
you would get the faster PIC code I posted with non-static global
