Subject: Re: accept mutex failure causes fork bomb
From: Jeff Trawick <trawick@gmail.com>
To: dev@httpd.apache.org
Date: Tue, 15 Sep 2009 08:07:43 -0400

On Mon, Sep 14, 2009 at 4:27 PM, Greg Ames <ames.greg@gmail.com> wrote:

> I'm trying to debug a problem where apparently the accept mutex went bad
> on a z/OS system running the worker MPM.  I'm guessing that some memory
> that we use for the semaphore got clobbered but don't have proof yet.
> The error log looks like:
>
> [Mon Sep 07 08:01:59 2009] [emerg] (121)EDC5121I Invalid argument.:
> apr_proc_mutex_unlock failed. Attempting to shutdown process gracefully.

Could it be some system limit exceeded, like max waiters on a mutex?
(IIRC some OSs require tuning of SysV semaphores.  (Older Solaris comes
to mind.))
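
For what it's worth -- assuming the accept mutex here is SysV-semaphore
based and the semid can be fished out of the apr_proc_mutex_t with a
debugger -- a quick standalone probe like this rough, untested sketch
would at least tell us whether the semid itself is still valid, versus
some other limit being hit:

/* rough, untested diagnostic sketch (not part of httpd): given a SysV
 * semaphore id, report whether the id is still valid.  If the id was
 * clobbered or the set was removed, semctl() fails with EINVAL -- the
 * same errno behind the EDC5121I messages above. */
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

int main(int argc, char **argv)
{
    int semid, val;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <semid>\n", argv[0]);
        return 1;
    }
    semid = atoi(argv[1]);

    val = semctl(semid, 0, GETVAL);
    if (val == -1) {
        if (errno == EINVAL) {
            fprintf(stderr, "semid %d is not valid (removed or clobbered?)\n",
                    semid);
        }
        else {
            perror("semctl(GETVAL)");
        }
        return 1;
    }
    printf("semid %d looks alive, semaphore 0 value = %d\n", semid, val);
    return 0;
}
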
> [Mon Sep 07 08:02:01 2009] [emerg] (121)EDC5121I Invalid argument.:
> apr_proc_mutex_lock failed. Attempting to shutdown process gracefully.
> [Mon Sep 07 08:02:02 2009] [emerg] (121)EDC5121I Invalid argument.:
> apr_proc_mutex_lock failed. Attempting to shutdown process gracefully.
> [Mon Sep 07 08:02:02 2009] [emerg] (121)EDC5121I Invalid argument.:
> apr_proc_mutex_lock failed. Attempting to shutdown process gracefully.
> [...]

> The rest of the error log is filled with lock failures.  Looking at the
> time stamps, you can see that perform_idle_server_maintenance went into
> exponential expansion, maxing out at about 24 lock failures per second.
> Unfortunately the fork()s were faster than z/OS could terminate the
> processes that had detected the mutex problem, so after forking 978
> httpd children, the system ran out of real memory and had to be IPLed.
>
> One of my colleagues asked why ServerLimit 64 didn't stop the fork bomb.
> Good question.  The reason is that the error path calls signal_threads()
> which causes the child to exit gracefully.  The listener thread sets
> ps->quiescing on the way out, which allows the "squatting" logic in
> perform_idle_server_maintenance to take over the scoreboard slot before
> the previous process has completely exited, bypassing the ServerLimit
> throttle.

> This raises several ideas for improvement:
>
> * Should we do clean_child_exit(APEXIT_CHILDSICK or CHILDFATAL) for this
> error?  We have a previous fix to detect accept mutex failures during
> restarts and tone down the error messages.  I don't recall seeing any
> false error messages after that fix.  We could also use
> requests_this_child to detect if this process has ever successfully
> served a request, and only do the clean_child_exit if it hasn't.

So nasty failures prior to successfully accepting a connection would
bypass squatting?  Good.

CHILDSICK or CHILDFATAL in that case?  In this example it probably wasn't
going to get any better.  However, I think it is reasonably likely that a
failure like this is just child process n+1 hitting some sort of resource
limit, and CHILDFATAL would take the whole server down over it, so
CHILDSICK seems better.
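
Untested sketch of what I picture for that error path -- the helper name
and the served_anything test are stand-ins (whatever we end up deriving
from requests_this_child), not actual worker.c code; clean_child_exit(),
signal_threads(), and ST_GRACEFUL are the existing internals:

/* untested sketch: listener thread, on a persistent accept mutex
 * failure, decides between "sick child" and the current graceful exit */
static void accept_mutex_failure_exit(int served_anything)
{
    if (!served_anything) {
        /* never got a single connection through the mutex: let the
         * parent throttle child creation instead of letting squatting
         * refill the slot at full speed */
        clean_child_exit(APEXIT_CHILDSICK);
    }
    else {
        /* the child was healthy earlier; keep today's graceful exit so
         * in-flight requests on the worker threads can finish */
        signal_threads(ST_GRACEFUL);
    }
}
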

> * Should we yank the squatting logic?  I think it is doing us more harm
> than good.  IIRC it was put in to make the server respond faster when
> the workload is spikey.  A more robust solution may be to set Min and
> MaxSpareThreads farther apart and allow ServerLimit to be enforced
> unconditionally.  disclaimer: I created ps->quiescing, so I was an
> accomplice.

My understanding is that squatting is required to deal with long-running
requests that keep a child which is trying to exit from actually exiting,
thus tying up a scoreboard slot indefinitely.  (We've had the assumption
that we didn't want to drastically overallocate the scoreboard to
accommodate a bunch of children with only a few threads still handling
requests.)

Is it reasonable to have up to MaxClients worth of squatting like we have
now?  (I think that is what we allow.)  No, I don't think so.

Should we axe squatting, respect ServerLimit, and thus make the admin
raise ServerLimit to accommodate exiting processes which are handling
long-running requests (and waste a bit of shared memory at the same
time)?  Maybe ;)  That change seems a bit drastic, but I'm probably just
scared of another long period of time before I halfway understand how it
behaves in the real world.
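
If we did axe it, the admin-facing side would be roughly this kind of
sizing (hypothetical numbers, stock worker directives): push ServerLimit
above MaxClients/ThreadsPerChild so exiting children still have scoreboard
slots to occupy.

# hypothetical sizing, not a recommendation: 600/25 = 24 slots carry the
# active load; the rest of ServerLimit is headroom for children that are
# still exiting gracefully while finishing long-running requests
<IfModule mpm_worker_module>
    ServerLimit          40
    ThreadsPerChild      25
    MinSpareThreads      75
    MaxSpareThreads     250
    MaxClients          600
</IfModule>
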



> * Does it make sense to fork more than MaxSpareThreads worth of child
> processes at a time?  MaxSpareThreads was 75 in this case, but we tried
> to fork at least 600 threads (same as MaxClients) worth of child
> processes in one pass of perform_idle_server_maintenance.

That's a good idea.
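
Something like this in perform_idle_server_maintenance, maybe -- an
untested sketch, assuming I'm remembering the worker globals
(max_spare_threads, ap_threads_per_child) right; free_length is however
many children the pass has decided it wants to fork:

/* untested sketch, not actual worker.c code: cap how many children a
 * single maintenance pass may create at MaxSpareThreads worth */
static int cap_children_per_pass(int free_length)
{
    int spare_cap = max_spare_threads / ap_threads_per_child;

    if (spare_cap < 1) {
        spare_cap = 1;           /* always allow at least one new child */
    }
    if (free_length > spare_cap) {
        free_length = spare_cap; /* never fork more than MaxSpareThreads
                                  * worth of children in one pass */
    }
    return free_length;
}
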

> This applies to worker and event; some of it may also apply to prefork.
> I'd appreciate thoughts and suggestions before committing anything.
>
> Thanks,
> Greg



--
Born in Roswell... married an alien...