httpd-dev mailing list archives

From Scott Hess
Subject Re: performance: using mlock(2) on httpd parent process
Date Wed, 20 Mar 2002 19:08:51 GMT
On Thu, 21 Mar 2002, Stas Bekman wrote:
> > On Wed, 20 Mar 2002, Stas Bekman wrote:
> > 
> >>mod_perl child processes save a lot of memory when they can share 
> >>memory with the parent process and quite often we get reports from 
> >>people that they lose that shared memory when the system decides to 
> >>page out the parent's memory pages because they are LRU (least 
> >>recently used, the algorithm used by many memory managers).
> >>
> > 
> > I'm fairly certain that this is not an issue.  If a page was shared 
> > COW before being paged out, I expect it will be shared COW when paged 
> > back in, at least for any modern OS.
> But if the system needs to page things out, most of the parent process's
> pages will be scheduled to go first, no? So we are talking about a
> constant page-in/page-out from/to the parent process as a performance
> degradation rather than memory unsharing. Am I correct?

The system is going to page out an approximation of the
least-recently-used pages.  If the children are using those pages, then
they won't be paged out, regardless of what the parent is doing.  [If the
children _aren't_ using those pages, then who cares?]

> > [To verify that I wasn't talking through my hat, here, I just verified
> > this using RedHat 7.2 running kernel 2.4.9-21.  If you're interested in my
> > methodology, drop me an email.]
> I suppose that this could vary from one kernel version to another.

Perhaps, but I doubt it.  I can't really do real tests on older kernels
because I don't have them on any machines I control, but I'd be somewhat
surprised if any OS which runs on modern hardware worked this way.  It
would require the OS to map a given page to multiple places in the
swapfile, which would be significant extra work, and I can't think of any
gains from doing so.

> I'm just repeating the reports posted to the mod_perl list. I've never
> seen such a problem myself, since I try hard to have close to zero swap
> usage.

:-).  In my experience, you can get some really weird stuff happening when
you start swapping mod_perl.  It seems to be stable in memory usage,
though, so long as you have MaxClients set low enough that your maximum
amount of committed memory is appropriate.  Also, I've seen people run
other heavyweight processes, like mysql, on the same system, so that when
the volume spikes, mod_perl spikes AND mysql spikes.  A sure recipe for

> [Yes, please let me know your methodology for testing this]

OK, two programs.  bigshare.c:

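#include <stdlib.h>
#include <string.h>
#include <signal.h>
#include <unistd.h>

#define MEGS 256
static char *mem = NULL;
static volatile char vv = 0;

/* On SIGUSR1, read one byte from every 4096-byte page to force it back in. */
static void handler(int signo)
{
    char val = 0;
    unsigned ii;
    signal(signo, handler);
    for (ii=0; ii<MEGS*1024*1024; ii+=4096) {
        val += mem[ii];
    }
    vv = val;
}

int main(int argc, char **argv)
{
    mem = calloc(1, MEGS*1024*1024);
    if (mem == NULL) return 1;

    /* Reconstructed (missing from the archived text): dirty every page so
       it is really resident, then fork twice to get the four processes
       that share it copy-on-write. */
    memset(mem, 1, MEGS*1024*1024);
    fork();
    fork();

    signal(SIGUSR1, handler);

    /* Sleep until a signal arrives. */
    while(1) {
        pause();
    }
    return 0;
}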

and makeitswap.c:

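#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
    char *mem = calloc(1, 384*1024*1024);
    /* Reconstructed: write to every page so the allocation really takes RAM. */
    if (mem != NULL) memset(mem, 1, 384*1024*1024);
    return 0;
}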

These both compile under RedHat 7.2; you might have to adjust the #include
directives for other systems.  Adjust the MEGS value in bigshare.c to be
big enough to matter, but not so big that it causes bigshare itself to
swap.  I chose 1/2 of my real memory size.  The 384 in makeitswap.c is 3/4
of my real memory, so it pushes tons of stuff into swap.
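If you'd rather not work the sizes out by hand, something like the following sketch could do it.  The helper names are made up for illustration, and _SC_PHYS_PAGES is a common extension rather than strict POSIX, so treat it as an assumption about your system.  The 1/2 and 3/4 fractions are the ones I used above.

```c
#include <unistd.h>

/* Physical RAM in megabytes, or 0 if the system will not say.
   _SC_PHYS_PAGES is an extension (glibc and others), not strict POSIX. */
static unsigned long phys_mem_megs(void)
{
    long pages = sysconf(_SC_PHYS_PAGES);
    long page_size = sysconf(_SC_PAGESIZE);
    if (pages <= 0 || page_size <= 0)
        return 0;
    return (unsigned long)(((unsigned long long)pages *
                            (unsigned long long)page_size) >> 20);
}

/* Hypothetical helper: MEGS for bigshare.c, half of real memory. */
static unsigned long bigshare_megs(unsigned long ram_megs)
{
    return ram_megs / 2;
}

/* Hypothetical helper: size for makeitswap.c, three quarters of real memory. */
static unsigned long makeitswap_megs(unsigned long ram_megs)
{
    return ram_megs / 4 * 3;
}
```

With 512M of real memory this gives 256 for bigshare and 384 for makeitswap, matching the constants in the two programs.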

Run bigshare.  Use ps or something appropriate to determine that, indeed,
all four bigshare processes are using up 256M of memory, but it's all

Then, run makeitswap.  All of the bigshare processes should partly or
fully page out.  Afterwards I was seeing RSS from 260k to 1M on the
bigshare processes.

Then, kill -USR1 one of the bigshare processes.  This causes the process to
re-read all of the memory it earlier allocated, thus it should page in
256M or so.  ps or top should show the RSS rising as it swaps back in.  
You can also use "vmstat 1" to watch it happen (watch the Swap/si column).  
On some systems you may need to use iostat.  More than likely your system
response also goes to heck, because it's spending so much time swapping
data in.  bigshare should end up with RSS about 256M, again.
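If you don't want to eyeball vmstat, a process can also count its own page faults.  This is only a sketch, not something I ran as part of the test above: the struct and function names are invented, and it assumes getrusage() fills in ru_minflt and ru_majflt on your system (Linux does).  Major faults are the ones that needed disk or swap I/O, so a process that pages its memory back in from swap should show a large jump in that counter.

```c
#include <stddef.h>
#include <sys/resource.h>

/* Snapshot of this process's page-fault counters. */
struct fault_counts {
    long minor;  /* faults satisfied without disk I/O */
    long major;  /* faults that had to read from disk or swap */
};

/* Read the current counters; returns 0 on success, -1 on failure. */
static int read_faults(struct fault_counts *out)
{
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    out->minor = ru.ru_minflt;
    out->major = ru.ru_majflt;
    return 0;
}

/* Read one byte per 4096-byte page, the same walk the bigshare.c
   handler does, and return the running sum so the reads are not
   optimized away. */
static char touch_pages(const char *mem, size_t len)
{
    char val = 0;
    size_t ii;
    for (ii = 0; ii < len; ii += 4096)
        val += mem[ii];
    return val;
}
```

Call read_faults() before and after touch_pages() and subtract; the difference in the major count is roughly the number of pages that had to come back from swap.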

Then, kill -USR1 another of the bigshare processes.  On my system, this
happened much faster than the first time.  Also, I saw only minimal
swapins in vmstat (128 or so per second, versus >10,000 per second for the 
-USR1 against the first process).  Send -USR1 to other bigshare processes, 
same results.  You can verify that the pages are shared with ps or 

> >>Therefore my question is there any reason for not using mlockall(2) in
> >>the parent process on systems that support it and when the parent 
> >>httpd is started as root (mlock* works only within root owned 
> >>processes).
> > 
> > I don't think mlockall is appropriate for something with the heft of
> > mod_perl.
> > 
> > Why are the pages being swapped out in the first place?  Presumably
> > there's a valid reason. 
> Well, the system coming close to zero of real memory available. The
> parent process starts swapping like crazy because most of its pages are
> LRU, slowing the whole system down and if the load doesn't go away, the
> system takes a whirl down to a halt.

I can think of a couple possible causes.  One is the MaxClients setting.  
Say you have MaxClients set to 50, but on a standard day you never need 
more than 15 servers.  Then you get listed on slashdot.  As you pass, oh, 
30 simultaneous servers, you start thrashing, so requests take longer to 
process, so you immediately spike to your MaxClients of 50 and the server 
goes right down the drain.  If you shut things down and start them back 
up, it's likely you'll immediately spike to 50 again, and back down the 
drain it goes.

I've found that it's _super_ important to make certain you've pushed
mod_perl to MaxClients under your standard conditions.  Once you start
swapping, you're doomed, unless the traffic was a very short spike.
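To put rough numbers on it, here is a back-of-the-envelope sketch.  The function and its parameters are hypothetical, and it assumes the simple model that committed memory is one shared chunk plus a fixed unshared amount per child; real mod_perl children drift from that, so measure your own.

```c
/* Hypothetical helper: the largest number of children whose committed
   memory still fits in RAM.  All sizes are in megabytes.
     ram_mb                 real memory in the machine
     reserved_mb            kept back for the OS, database, file cache
     shared_mb              pages shared by the parent and all children
     unshared_per_child_mb  private (unshared) pages of each child */
static unsigned long max_clients_for(unsigned long ram_mb,
                                     unsigned long reserved_mb,
                                     unsigned long shared_mb,
                                     unsigned long unshared_per_child_mb)
{
    unsigned long budget;

    if (unshared_per_child_mb == 0)
        return 0;               /* no per-child figure, no answer */
    if (ram_mb <= reserved_mb + shared_mb)
        return 0;               /* nothing left for children at all */

    budget = ram_mb - reserved_mb - shared_mb;
    return budget / unshared_per_child_mb;
}
```

For example, with 512M of RAM, 128M held back, 64M shared and 10M unshared per child, this comes out to 32, so a MaxClients of 50 on that box would be asking for trouble.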

Another possible cause is that you have another heavyweight server running
on the same server.  As I indicated above, I've seen people do this with
things like mysql.  Since high mod_perl traffic implies high mysql
traffic, it's just like having MaxClients set too high, but twice as bad!

Another possible cause is that the OS is aggressively grabbing pages for
the filesystem cache.  It's possible that tuning down the size of the
filesystem cache would be appropriate - many webservers have a very large
maximum amount of data they might serve, but a very small working set.

Really, though, all of this tends to come down to MaxClients.  The
database load is proportional to the number of clients, the filesystem
load is proportional to the number of clients, everything is proportional
to the number of clients.  MaxClients is as close to a silver bullet as
I've seen.  People tend to set MaxClients based on their expected load,
rather than on how much load their server can handle. You just have to
arrange for your maximum committed memory usage to appropriately reflect
the memory available, or you're doomed, there's nothing the OS can do to

> > Doing mlockall on your mod_perl would result in restricting the memory
> > available to the rest of the system.  Whatever is causing mod_perl to
> > page out would then start thrashing.  Worse, since mlockall will lock
> > down mod_perl pages indiscriminately, the resulting thrashing will
> > probably be even worse than what they're seeing right now.
> Possible, I've never tried this myself and therefore asked. Someone has
> suggested using P_SYSTEM flag which is supposed to tell the system not
> to page out certain pages, but seems to be a *BSD thingy.

Really, the problem is that it's very hard to figure out which pages of
mod_perl really need this treatment.  Heck, it's very hard with any
program, but with mod_perl you have to deal with the obfuscating Perl
virtual machine layer.  That's pretty tough.  If you could just lock down
whatever is needed to keep running, that would be great...

