httpd-dev mailing list archives

From Dean Gaudet <dgau...@arctic.org>
Subject alloca (was Re: cvs commit: apache-1.3/src/modules/standard mod_log_referer.c)
Date Sun, 24 May 1998 23:11:45 GMT


On Fri, 22 May 1998, Alexei Kosut wrote:

> My understanding was that one should never use alloca. Not if one expects
> code to work correctly, or at all.

Nah, it's safe, it's just not portable.

> One thing to note is that successful use of alloca() depends on the
> compiler intervening on alloca()'s behalf. I personally would not suspect
> that all compilers correctly support it. gcc does (actually, gcc supports
> alloca as a builtin, rather than calling the system library, so it
> probably works all right), but I wouldn't guarantee that every compiler
> does. And I'd be wary of Apache timeouts' use of longjmps and that
> interaction with alloca.

On traditional systems where the stack just grows down (or up) from
some fixed point, alloca() is trivial.  The main reason the compiler needs
to know about it is that it must set up a stack frame to use it.  In
x86 parlance, the stack frame setup/destruction looks like this:

    pushl %ebp
    movl %esp,%ebp
    subl $20,%esp	# create a 20 byte stack frame
    ... rest of the function goes here, and references negative offsets
    ... from %ebp to access elements of the stack frame
    movl %ebp,%esp	# release the frame (and anything alloca'd)
    popl %ebp

In this context you can (almost) implement alloca just like this:

    subl $nnnn,%esp
    movl %esp,%eax

Where nnnn is a 4-byte aligned size, and the resulting pointer is %eax.
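
Seen from C, the same idea looks roughly like the sketch below. It is only
an illustration: round_up4() and scratch_strlen() are made-up names, and
the <alloca.h> header is an assumption that holds on glibc-style systems
but not everywhere. The rounding helper mirrors the "4-byte aligned size"
requirement, and the scratch buffer vanishes when the function returns.

```c
#include <alloca.h>   /* assumption: glibc-style header; not universal */
#include <stddef.h>
#include <string.h>

/* Round n up to the next multiple of 4 (the "4-byte aligned size"). */
static size_t round_up4(size_t n)
{
    return (n + 3u) & ~(size_t)3u;
}

/* Hypothetical helper: copy s into stack scratch space and measure the
   copy.  The scratch space is released automatically on return. */
static size_t scratch_strlen(const char *s)
{
    size_t n = strlen(s) + 1;          /* include the terminating NUL */
    char *tmp = alloca(round_up4(n));  /* moves the stack pointer down */

    memcpy(tmp, s, n);
    return strlen(tmp);
}
```

The pointer returned by alloca() must never be returned or stored
anywhere that outlives the calling function.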

But this doesn't deal with growing the stack, which you have to do in
page-sized increments (4k on x86).  So if the allocation is larger than
4095 bytes you need to "touch" each 4k page once to ensure the kernel
grows the stack.  (The stack has a "guard" page at the bottom, which
is not-present, so any read or write to it will fault, and the kernel
will then add one more page to the stack and move the guard page one
further down.)
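
The "touch each page" step can be sketched in C as below. This is a
simplified model, not what any particular compiler emits: probe_pages()
and PROBE_PAGE are hypothetical names, the 4k page size is the x86 figure
from above, and the block is assumed to sit directly below stack that is
already mapped. It walks down from the top of the block so that no two
consecutive writes are more than one page apart.

```c
#include <stddef.h>

#define PROBE_PAGE 4096u   /* assumed page size (4k on x86) */

/* Write one byte every PROBE_PAGE bytes, walking down from the top of
   the block [base, base+len), then write the lowest byte.  Returns the
   number of writes performed. */
static size_t probe_pages(volatile char *base, size_t len)
{
    size_t touched = 0;
    size_t off = len;

    while (off > PROBE_PAGE) {
        off -= PROBE_PAGE;
        base[off] = 0;     /* on a real stack this may fault on the guard */
        touched++;
    }
    if (len > 0) {
        base[0] = 0;       /* finally touch the new bottom of the stack */
        touched++;
    }
    return touched;
}
```

For allocations of one page or less the loop body never runs, which is
the case where the probing can be skipped entirely.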

gcc's use of the builtin allows it to handle a bunch of things easily.
It can figure out if the size arg is a constant, and if so it'll do the
rounding at compile time, and if it's smaller than a page-size allocation
it can skip the stack growing loop.  You could actually implement it with
an inline assembly function in gcc, but you wouldn't get those features.

> If we want temporary memory, why not just use malloc() and free()?
> Or a large defined buffer on the stack? That seems to work well for the
> rest of the world... And besides, using palloc() for a couple of bytes of
> memory that won't be freed for a bit is probably all right. Requests go
> quickly.

palloc() is actually not much more than the above normally... except that
it incurs a function call overhead.  On non-x86 architectures, it's
possible that alloca() will cause a function to end up with a stack
frame it wouldn't otherwise need -- but on x86 almost all functions have
stack frames because the x86 has so few registers the compiler needs
somewhere to spill temporaries.
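
To make the comparison concrete, here is a toy pointer-bump allocator in
C.  It is not Apache's pool code: struct toy_pool, toy_pool_init() and
toy_palloc() are hypothetical names, and the 8-byte rounding is an
assumption.  It only shows why the common case is little more than a
subtraction or addition on a pointer, plus the cost of the call itself.

```c
#include <stddef.h>

/* Hypothetical pool: a single block with a "next free byte" cursor. */
struct toy_pool {
    char *next;   /* first free byte */
    char *end;    /* one past the last usable byte */
};

static void toy_pool_init(struct toy_pool *p, void *mem, size_t size)
{
    p->next = mem;
    p->end = (char *)mem + size;
}

/* Bump the cursor by n bytes (rounded up to 8, an assumed alignment).
   Returns NULL when the block is exhausted; a real pool would fetch
   another block instead. */
static void *toy_palloc(struct toy_pool *p, size_t n)
{
    void *result;

    n = (n + 7u) & ~(size_t)7u;
    if ((size_t)(p->end - p->next) < n)
        return NULL;
    result = p->next;
    p->next += n;
    return result;
}
```

Nothing is freed individually here; the whole block goes away at once,
which is the pool counterpart of popping a stack frame.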

Using a large buffer on the stack is about the same performance-wise as
alloca, except that it gives you fixed lengths for things, and tends to
make stacks that are far larger than they need to be.  I still haven't
done the analysis, but I need to figure out what stack depth we achieve
regularly in apache, because plopping 8k buffers on the stack left and
right the way we do can chew up a bunch of pages... and there'll be little
active data on the stack.  This is fine if you've got one process, but
when every process (thread) needs a dozen extra pages because of this
sloppiness... well, it adds up.
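
The two styles can be put side by side as in the sketch below.  All of the
names (FIXED_BUF, sum_bytes, join_fixed, join_alloca) are invented for
illustration, and <alloca.h> is again an assumption about the platform.
Both functions build "a=b" in temporary storage and return a byte sum;
the first always reserves 8k of stack and must reject long input, the
second reserves only what the strings need.

```c
#include <alloca.h>   /* assumption: glibc-style header; not universal */
#include <stdio.h>
#include <string.h>

#define FIXED_BUF 8192   /* the "8k buffer on the stack" pattern */

static unsigned sum_bytes(const char *s)
{
    unsigned sum = 0;
    while (*s)
        sum += (unsigned char)*s++;
    return sum;
}

/* Fixed buffer: 8k of stack per call, hard length limit. */
static int join_fixed(const char *a, const char *b, unsigned *out)
{
    char buf[FIXED_BUF];
    size_t need = strlen(a) + strlen(b) + 2;   /* '=' and NUL */

    if (need > sizeof buf)
        return -1;                             /* input too long */
    snprintf(buf, sizeof buf, "%s=%s", a, b);
    *out = sum_bytes(buf);
    return 0;
}

/* alloca: stack use tracks the input, no fixed limit (and no check
   against exhausting the stack, so sizes must be bounded elsewhere). */
static int join_alloca(const char *a, const char *b, unsigned *out)
{
    size_t need = strlen(a) + strlen(b) + 2;
    char *buf = alloca(need);

    snprintf(buf, need, "%s=%s", a, b);
    *out = sum_bytes(buf);
    return 0;
}
```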

> And thanks to the nice Sun people, from the Solaris alloca(3C) man page:
> 
> "alloca() is machine-, compiler-, and most of all, system-dependent. Its
> use is strongly discouraged."

feh.  alloca() falls into a category where I say "if a system doesn't
support it, then folks shouldn't be building high-end webservers on it."
You can "emulate" alloca() to a high degree of accuracy for systems
that don't support it.  But, it's not something we should start using
now... for 2.0 I wouldn't be averse to us starting to use it.

Dean

