On Sun, 24 May 1998, Alexei Kosut wrote:
> Sounds good. But if the compiler's generated code references stack data
> using positive offsets from the stack pointer (%esp? My x86 assembly
> knowledge is nonexistent; the only assembly I've ever done is SPARC, and
> only a few lines of that) instead of negative offsets from the frame
> pointer (%ebp, I presume), then alloca() will mess up basically all the
> code that follows it, correct?

On x86, code can be generated both ways. If you use gcc
-fomit-frame-pointer, for example, you'll get %esp-relative stack frame
addressing and gain another free general register (%ebp)... until you do
something which has an "unknown" effect on %esp, such as alloca(). At
that point gcc will set up an %ebp frame and use %ebp instead.
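
A minimal sketch of the kind of function in question, assuming gcc on a system that provides <alloca.h> (sum_squares is a made-up name for illustration only). Since n is only known at run time, the compiler cannot track %esp statically past the alloca() call. Compiling with something like "gcc -O2 -fomit-frame-pointer -S" and reading the output should show whether a frame pointer gets set up anyway; the exact code will vary by gcc version and target.

```c
#include <alloca.h>  /* assumption: glibc-style header; some systems declare alloca() elsewhere */
#include <stddef.h>

/* Hypothetical example: sum of i*i for i in [0, n), using scratch space from alloca(). */
unsigned long sum_squares(size_t n)
{
    unsigned long *tmp;
    unsigned long total = 0;
    size_t i;

    if (n == 0)
        return 0;

    /* The stack pointer moves by an amount unknown at compile time. */
    tmp = alloca(n * sizeof *tmp);

    for (i = 0; i < n; i++)
        tmp[i] = (unsigned long)i * i;
    for (i = 0; i < n; i++)
        total += tmp[i];

    /* The alloca()ed space is released automatically when the function returns. */
    return total;
}
```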

There are other reasons to use %esp even when you've got %ebp set up. One
case (which none of the gcc family generates, that I know of) is
floating-point code: the x86 ABI doesn't require the stack to be 8-byte
aligned, it only requires 4-byte alignment. So when you spill 8-byte
floating-point temporaries onto the stack it's 50% likely they'll be
unaligned (at an address == 4 mod 8). High-end x86 compilers can do
something like this to avoid the unaligned-access performance hit:

    pushl %ebp
    movl %esp,%ebp
    andl $-8,%esp
    subl $frame_size,%esp

which gets an 8-byte aligned %esp... and then temporaries are referenced
off of %esp and %ebp is only used to restore the stack at the end of the
function. (There's an alternative that gives you %ebp as a general
register too...)
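
The arithmetic behind the andl step can be sketched in C. This only models the address calculation; align_down_8 and aligned_frame_sp are hypothetical names, and the sample addresses are invented. Masking with -8 clears the low three bits, so an %esp that is 4 mod 8 drops by 4 and one that is already aligned stays put. The result stays 8-byte aligned after the frame is reserved only if frame_size is itself a multiple of 8.

```c
#include <stdint.h>

/* Round an address down to a multiple of 8: the C equivalent of and-ing with -8. */
uintptr_t align_down_8(uintptr_t addr)
{
    return addr & ~(uintptr_t)7;
}

/* Hypothetical model of the prologue: align the stack pointer, then
 * reserve frame_size bytes below it (the stack grows downward on x86). */
uintptr_t aligned_frame_sp(uintptr_t sp, uintptr_t frame_size)
{
    return align_down_8(sp) - frame_size;
}
```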

> That sounds right. Assuming that alloca() is implemented inline. If it's a
> library function call, it might have to do all sorts of odd things to be
> able to manipulate the activation record of the previous stack frame. And
> that seems a bit iffy to me, especially if you're mixing code generated by
> a multiplicity of compilers with various options on different systems.
>
> Of course, if you're only concerned with gcc on recent revisions of major
> systems (as you've said you are), then I guess that isn't a problem.

alloca essentially has to use compiler-dependent features... it's either
implemented as a builtin (as it is in gcc), or using extensions (like
inline assembly, which is how it's done in WATCOM C).

There is no multi-compiler problem, because alloca() does not violate the
ABI in any way. Compilers can only interoperate if they use the same
conventions...

> It seems to me that if function call overhead (which is, after all, just a
> few machine instructions) is something we need to worry about on a given
> architecture, then we should take all the CPUs and squish them, and force
> people to buy computers that make sense. Whose bright idea was it to give
> the x86 only 16 registers anyway? SPARC has 40. PA-RISC chips have 72. And
> those are only the integer ones.

x86 has 6 that are of general use, and 2 that typically have hardwired
uses (%esp, %ebp). It would be heavenly to have 16 registers... maybe
you're thinking of 680x0.

Dean