From: victork@hecl.com.hk
To: dev@httpd.apache.org
Subject: race conditions
Date: Fri, 30 Nov 2001 12:32:03 +0800

Hi all,

I am running Apache 1.3.20 on SuSE Linux 6.4. While reading the documentation on the apache.org web site, I noticed that it mentions several race conditions. However, it only says that they relate to the restart signals (SIGHUP/SIGUSR1) and the die signal (SIGTERM); it does not specify how or where the races occur.

Looking into the Apache 1.3.20 source code, one place I found that might have a race condition is the ap_unblock_alarms function (pasted below), after the --alarms_blocked statement and before the ++alarms_blocked statement. The race occurs if exit_after_unblock is true and a SIGALRM arrives in that interval: this takes us into the timeout function (though we are supposed to exit). However, in that case the server would still exit normally in timeout(int sig), so I don't really see the problem here.

Another suspicious place is read_request_line: a request could be read by the getline function, and before we can disable the SIGUSR1 signal, a graceful restart signal (SIGUSR1) arrives. We would then kill the child process even though we have a request to process.
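To make the read_request_line window concrete, here is a minimal sketch of one standard POSIX way to avoid that kind of race. It is not what Apache 1.3.20 does; it assumes a platform with pselect(), and every name in it (graceful_restart_requested, init_graceful_restart, wait_for_request) is hypothetical. SIGUSR1 stays blocked at all times except while the child is idle inside pselect(), so the signal cannot land between "request arrived" and "signal disabled".

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <unistd.h>   /* read(), for callers once wait_for_request() returns 1 */

/* Hypothetical flag, written only by the SIGUSR1 handler. */
static volatile sig_atomic_t graceful_restart_requested = 0;

static void on_sigusr1(int sig)
{
    (void)sig;
    graceful_restart_requested = 1;   /* only record the request */
}

/* Call once at child start-up: install the handler, then keep SIGUSR1
 * blocked during normal execution. Returns 0 on success, -1 on error. */
static int init_graceful_restart(void)
{
    struct sigaction sa;
    sigset_t block;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    return sigprocmask(SIG_BLOCK, &block, NULL);
}

/* Wait until fd is readable or a graceful restart is requested.
 * Returns 1 if a request is waiting, 0 if a restart was requested while
 * idle, -1 on error. SIGUSR1 is deliverable only inside pselect(). */
static int wait_for_request(int fd)
{
    sigset_t wait_mask;
    fd_set readable;

    assert(fd >= 0 && fd < FD_SETSIZE);

    /* Mask used while waiting: the current mask minus SIGUSR1. */
    if (sigprocmask(SIG_SETMASK, NULL, &wait_mask) != 0)
        return -1;
    sigdelset(&wait_mask, SIGUSR1);

    for (;;) {
        FD_ZERO(&readable);
        FD_SET(fd, &readable);
        int n = pselect(fd + 1, &readable, NULL, NULL, NULL, &wait_mask);
        if (n > 0)
            return 1;                 /* data is waiting; SIGUSR1 is blocked again */
        if (n < 0 && errno != EINTR)
            return -1;
        if (graceful_restart_requested)
            return 0;                 /* idle and asked to restart */
    }
}
```

With this shape, a child would call wait_for_request() before reading each request and exit only when it returns 0. A restart signal that arrives after a request has shown up stays pending until the next wait, so the request in hand is still served.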

Anyway, my question is: did I interpret the race conditions (in ap_unblock_alarms and read_request_line) correctly, and what other race conditions was the documentation referring to?

    API_EXPORT(void) ap_unblock_alarms(void)
    {
        --alarms_blocked;
        if (alarms_blocked == 0) {
            if (exit_after_unblock) {
                /* We have a couple race conditions to deal with here, we can't
                 * allow a timeout that comes in this small interval to allow
                 * the child to jump back to the main loop.  Instead we block
                 * alarms again, and then note that exit_after_unblock is
                 * being dealt with.  We choose this way to solve this so that
                 * the common path through unblock_alarms() is really short.
                 */
                ++alarms_blocked;
                exit_after_unblock = 0;
                clean_child_exit(0);
            }
            if (alarm_pending) {
                alarm_pending = 0;
                timeout(0);
            }
        }
    }

In addition, I am wondering about the following:

Why doesn't the check on exit_after_unblock appear in lingerout as well (as it does in the timeout function), since a race condition similar to the one described above for ap_unblock_alarms exists in lingerout too? Is it because current_conn is never NULL once the alarms are enabled? If so, why do we bother to check current_conn in both the timeout function and the lingerout function? It seems to me that current_conn is always non-NULL whenever alarms are enabled.

When would you want to use ap_block_alarms? Is it to make sure that the code that deals with the memory pools does not get interrupted by any signals? I am concerned about this because I am trying to write a similar server (though of course a lot simpler), and I want to know whether I would need such blocking.

Thanks in advance,
Victor
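
For anyone following along, the deferral pattern that the pasted ap_unblock_alarms seems to rely on can be sketched in isolation as below. This is only an illustration under my own assumptions, not the Apache code: all names (blocked_depth, alarm_deferred, handle_timeout, and so on) are hypothetical, there is no exit_after_unblock handling, and handle_timeout just counts calls where the real timeout function does far more.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Hypothetical state; the handler reads blocked_depth and may set alarm_deferred. */
static volatile sig_atomic_t blocked_depth = 0;    /* nesting count of block_alarms() */
static volatile sig_atomic_t alarm_deferred = 0;   /* SIGALRM arrived while blocked */
static volatile sig_atomic_t timeouts_handled = 0; /* stand-in for real timeout work */

static void handle_timeout(void)
{
    ++timeouts_handled;
}

/* SIGALRM handler: act immediately only when no critical section is open;
 * otherwise just remember that the alarm fired. */
static void on_alarm(int sig)
{
    (void)sig;
    if (blocked_depth > 0) {
        alarm_deferred = 1;
        return;
    }
    handle_timeout();
}

/* Install the handler with sigaction() so it stays installed across deliveries. */
static int install_alarm_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGALRM, &sa, NULL);
}

/* Enter a section that must not be cut short by the timeout logic. */
static void block_alarms(void)
{
    ++blocked_depth;
}

/* Leave the section; when the outermost one closes, run any deferred timeout. */
static void unblock_alarms(void)
{
    assert(blocked_depth > 0);        /* catches unbalanced calls */
    --blocked_depth;
    if (blocked_depth == 0 && alarm_deferred) {
        alarm_deferred = 0;
        handle_timeout();
    }
}
```

The signal itself is never masked here; the handler still runs, but while a section is open it only records the event. That keeps block_alarms() and unblock_alarms() down to a counter update on the common path, which matches the "really short" remark in the pasted comment.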