Subject: Watchdog code for Apache
From: Nick Kew <nick@webthing.com>
To: dev@httpd.apache.org
Date: Sat, 04 Jun 2005 15:02:41 +0100

A little while back, I hacked up a quick-and-dirty experimental watchdog
module.  It forks a watchdog process in the pre_mpm hook, which then
watches the scoreboard and kills any process in which a request
has been running for longer than a predefined time limit.

Currently this is limited to killing processes, which puts it
at the same level of usefulness as mod_watchcat.  My attempts
to do anything better than that in a signal handler haven't
gone anywhere useful, and this looks like an unpromising
approach.  It'll work for prefork, but for the other MPMs
it looks like a dead end.  Rather than kill the process, I'd
like to terminate just the errant thread, including winding
down its pools.

It also seems better for the watchdog code to run in Apache's
master process than to fork off a separate process.

My current thinking is to do the work in ap_wait_or_timeout:

* go through the scoreboard looking for threads tied up in a
  request that's gone on too long
* send SIGUSR2 to the owning process
* return with pid.pid set to that process, for the MPM to deal with
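
As a rough sketch of the scan step only, assuming a much simplified
stand-in for the scoreboard: the worker_slot struct and
find_overdue_process below are made up for illustration and are not
httpd API.  The real scoreboard has its own types and fields.

```c
#include <stddef.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical, simplified stand-in for one scoreboard worker slot.
 * This is NOT the real httpd scoreboard layout. */
typedef struct {
    pid_t  pid;            /* child process that owns this worker */
    int    busy;           /* nonzero while a request is in progress */
    time_t request_start;  /* when the current request began */
} worker_slot;

/* Return the pid of the first process that has a worker whose request
 * has been running for more than `limit` seconds, or 0 if there is none. */
pid_t find_overdue_process(const worker_slot *slots, size_t n,
                           time_t now, time_t limit)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (slots[i].busy && now - slots[i].request_start > limit) {
            return slots[i].pid;
        }
    }
    return 0;
}
```

A caller along the lines described above would presumably send
SIGUSR2 to a nonzero result with kill() and then hand that pid back
to the MPM.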

The per-process signal handler is then reduced to setting a flag
for the MPM to act on.  At worst, the MPM can terminate the process
cleanly, perhaps by reusing the graceful restart code for that process.
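
To illustrate what "reduced to setting a flag" might look like, here
is a minimal sketch using plain POSIX sigaction.  The function and
variable names are hypothetical, and how an actual MPM would poll the
flag is left open.

```c
#define _POSIX_C_SOURCE 200809L  /* for sigaction under strict C modes */
#include <signal.h>
#include <string.h>

/* Set by the handler, polled by the (hypothetical) MPM child loop. */
static volatile sig_atomic_t wind_down_requested = 0;

/* The handler does nothing except record that the signal arrived;
 * a plain store to a sig_atomic_t is safe in a signal handler. */
static void on_watchdog_signal(int signo)
{
    (void)signo;
    wind_down_requested = 1;
}

/* Install the handler for SIGUSR2.  Returns 0 on success, -1 on error. */
int install_watchdog_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_watchdog_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR2, &sa, NULL);
}

/* Meant to be called from the child's main loop between units of
 * work.  Returns nonzero once the watchdog has signalled us. */
int watchdog_wind_down_pending(void)
{
    return wind_down_requested != 0;
}
```

All the real cleanup (finishing or abandoning the request, destroying
pools) would then happen in normal MPM code rather than in signal
context.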

Questions:
* Any objections in principle to adding watchdog code in this manner?
* Does this plan make sense?
* Is there a better plan that'll enable me to get down to thread level
  and stop just the errant thread?  Perhaps the above with an optional
  thread-shutdown hook somewhere?

-- 
Nick Kew