Subject: Watchdog code for Apache
From: Nick Kew
To: dev@httpd.apache.org
Date: Sat, 04 Jun 2005 15:02:41 +0100
Message-Id: <1117893761.24898.38.camel@asgard>

A little while back, I hacked up a quick&dirty experimental watchdog module. It forks a watchdog process in the pre_mpm hook, which then watches the scoreboard and kills any process in which some request has taken more than some predefined time. Currently this is limited to killing processes, which puts it at the same level of usefulness as mod_watchcat.
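
For illustration, the scoreboard scan described above might look roughly like the following. This is a minimal sketch, not the module's actual code: slot_t and find_overdue are hypothetical names, and the real httpd scoreboard has a different layout and is accessed through httpd's own API.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical, simplified stand-in for one scoreboard slot.
 * The real httpd scoreboard is laid out differently. */
typedef struct {
    pid_t  pid;            /* process that owns this slot */
    int    busy;           /* non-zero while a request is being handled */
    time_t request_start;  /* when the current request began */
} slot_t;

/* Return the pid of the first process that has a request running for
 * more than `limit` seconds at time `now`, or (pid_t)-1 if there is none. */
pid_t find_overdue(const slot_t *slots, size_t n, time_t now, double limit)
{
    assert(slots != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (slots[i].busy && difftime(now, slots[i].request_start) > limit)
            return slots[i].pid;
    }
    return (pid_t)-1;
}
```

A watchdog loop built on this would call find_overdue periodically and signal the returned pid. Returning only the pid reflects the limitation noted above: the unit being acted on is the whole process, not the individual request.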
My attempts to do anything better than that in a signal handler haven't gone anywhere useful, and this looks like an unpromising approach. It'll work for prefork, but as far as other MPMs are concerned it looks like a dead end.

What I'd like to do, rather than kill the process, is terminate the errant thread, including winding down its pools. It also seems better for the watchdog code to run in Apache's master process than to fork off a separate process.

My current thinking is to use ap_wait_or_timeout and:

* go through the scoreboard looking for threads tied up in a request that's gone on too long
* send SIGUSR2 to the process
* return with pid.pid set to the process for the MPM to deal with

Then the per-process signal handler is reduced to setting a flag for the MPM to deal with. Now the MPM can at worst terminate the process cleanly, perhaps using the graceful restart code on it.

Questions:

* Any objections in principle to adding watchdog code in this manner?
* Does this plan make sense?
* Is there a better plan that'll enable me to get down to thread level and stop just the errant thread? Perhaps the above with an optional thread-shutdown hook somewhere?

-- 
Nick Kew
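
As a rough illustration of the "signal handler only sets a flag" step in the plan above, here is a generic POSIX sketch. It is not httpd code: watchdog_flagged, install_watchdog_handler, watchdog_check_and_clear and watchdog_self_check are hypothetical names, and how an MPM would actually poll such a flag is left open.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Set by the signal handler, read by the normal processing loop. */
static volatile sig_atomic_t watchdog_flagged = 0;

/* The handler does nothing except record that the signal arrived. */
static void on_watchdog_signal(int signo)
{
    (void)signo;
    watchdog_flagged = 1;
}

/* Install the handler for SIGUSR2; returns 0 on success, -1 on failure. */
int install_watchdog_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_watchdog_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR2, &sa, NULL);
}

/* Called outside signal context: returns 1 if the flag was set
 * since the last call (and clears it), otherwise 0. */
int watchdog_check_and_clear(void)
{
    if (watchdog_flagged) {
        watchdog_flagged = 0;
        return 1;
    }
    return 0;
}

/* Hypothetical self-check: signal ourselves and confirm the flag path. */
void watchdog_self_check(void)
{
    int rc = install_watchdog_handler();
    assert(rc == 0);
    (void)rc;
    raise(SIGUSR2);
    int seen = watchdog_check_and_clear();
    assert(seen == 1);
    (void)seen;
}
```

Keeping the handler down to a single flag write avoids doing pool cleanup or other non-reentrant work in signal context. The actual shutdown decision then happens wherever the flag is polled.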