Mailing-List: contact dev-help@httpd.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@httpd.apache.org
Message-ID: <B1759A2EDC27D411BDCE00508B6376FD95BD71@ECREXG01>
From: victork@hecl.com.hk
To: dev@httpd.apache.org
Subject: race conditions
Date: Fri, 30 Nov 2001 12:32:03 +0800
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"

Hi all,


I am running Apache 1.3.20 on SuSE Linux 6.4. While reading the
documentation on the apache.org web site, I noticed it mentions that
several race conditions exist. However, it only says that they relate to
the restart signals (SIGHUP/SIGUSR1) and the die signal (SIGTERM); it does
not specify how or where the races occur.

Looking into the Apache 1.3.20 source code, one place I found that might
have a race condition is the ap_unblock_alarms function (pasted below), in
the interval after the --alarms_blocked statement and before the
++alarms_blocked statement. The race arises if exit_after_unblock is true
and a SIGALRM arrives in that interval: this takes us into the timeout
function even though we are supposed to exit. In that case, however, the
server would still exit normally in timeout(int sig), so I don't really
see the problem here.

Another suspicious place is read_request_line: a request could be read by
the getline function, and before we can disable the SIGUSR1 signal, a
graceful restart signal (SIGUSR1) arrives, so the child process is killed
even though it has a request to process.

Anyway, my question is: did I interpret these race conditions (in
ap_unblock_alarms and read_request_line) correctly, and what other race
conditions was the documentation referring to?


API_EXPORT(void) ap_unblock_alarms(void)
{
    --alarms_blocked;
    if (alarms_blocked == 0)
    {
        if (exit_after_unblock)
        {
            /* We have a couple race conditions to deal with here, we can't
             * allow a timeout that comes in this small interval to allow
             * the child to jump back to the main loop.  Instead we block
             * alarms again, and then note that exit_after_unblock is
             * being dealt with.  We choose this way to solve this so that
             * the common path through unblock_alarms() is really short.
             */
            ++alarms_blocked;
            exit_after_unblock = 0;
            clean_child_exit(0);
        }
        if (alarm_pending)
        {
            alarm_pending = 0;
            timeout(0);
        }
    }
}
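
To make the read_request_line concern concrete, here is a minimal sketch of
how I imagine my own server could avoid that window. It is not how Apache
1.3.20 does it; it is a different technique that I am only assuming would
fit: keep SIGUSR1 blocked all the time, let the handler merely set a flag,
and open the signal up only while waiting for a new request, using POSIX
pselect(). All names (on_sigusr1, install_sigusr1_handler, wait_for_request,
graceful_exit_requested) are hypothetical and my own.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/select.h>

/* Set by the handler; checked by the main flow at safe points only. */
static volatile sig_atomic_t graceful_exit_requested = 0;

static void on_sigusr1(int sig)
{
    (void)sig;
    graceful_exit_requested = 1;   /* never exits from inside the handler */
}

/* Install the handler and keep SIGUSR1 blocked during normal execution. */
int install_sigusr1_handler(void)
{
    struct sigaction sa;
    sigset_t block;

    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    return sigprocmask(SIG_BLOCK, &block, NULL);
}

/* Returns 1 if fd has a request to read, 0 if a graceful exit was
 * requested while idle, -1 on error. */
int wait_for_request(int fd)
{
    sigset_t during_wait;
    fd_set readable;
    int n;

    /* Fetch the current mask and remove SIGUSR1 from the copy. */
    if (sigprocmask(SIG_BLOCK, NULL, &during_wait) != 0)
        return -1;
    sigdelset(&during_wait, SIGUSR1);

    for (;;) {
        if (graceful_exit_requested)
            return 0;
        FD_ZERO(&readable);
        FD_SET(fd, &readable);
        /* pselect swaps in during_wait only for the duration of the wait,
         * so SIGUSR1 can be delivered here and nowhere else. */
        n = pselect(fd + 1, &readable, NULL, NULL, NULL, &during_wait);
        if (n > 0)
            return 1;
        if (n < 0 && errno != EINTR)
            return -1;
    }
}
```

With this arrangement a child would call wait_for_request() between
requests and exit only when it returns 0; once it returns 1, the signal
cannot kill the child before the request has been handled, although the
flag should still be checked again afterwards.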

In addition, I am wondering about the following:

Why doesn't the check on exit_after_unblock appear in lingerout as well (as
it does in the timeout function), since a similar race condition (the one
mentioned above in ap_unblock_alarms) exists in lingerout too? Is it
because current_conn is never NULL once the alarms are enabled? If so, why
do we bother to check current_conn in both the timeout function and the
lingerout function? It seems to me that current_conn is always non-NULL
whenever alarms are enabled.

When would you want to use ap_block_alarms? Is it to make sure that the
code which deals with the memory pools does not get interrupted by any
signals? I am concerned about this because I am trying to write a similar
server (though of course a lot simpler), and I want to know whether I
would need such blocking.
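
To show what I mean by "such blocking", this is roughly what I have in mind
for my own server. It is only a minimal sketch modelled loosely on the
ap_block_alarms/ap_unblock_alarms pair: the handler defers its work while a
nesting counter is non-zero, and the deferred work runs when the counter
drops back to zero. The names (block_alarms, unblock_alarms, on_alarm,
handle_alarm_now, alarms_handled) are hypothetical and my own, not
Apache's, and I use plain signal() only for brevity.

```c
#include <signal.h>

static volatile sig_atomic_t blocked_depth = 0;      /* nesting count */
static volatile sig_atomic_t alarm_was_deferred = 0; /* arrived while blocked */
static volatile sig_atomic_t alarms_handled = 0;     /* stand-in for real work */

/* Placeholder for whatever the timeout should really do. */
static void handle_alarm_now(void)
{
    ++alarms_handled;
}

/* Signal handler: act at once only if no critical section is active. */
static void on_alarm(int sig)
{
    (void)sig;
    if (blocked_depth > 0) {
        alarm_was_deferred = 1;   /* remember it for unblock_alarms() */
        return;
    }
    handle_alarm_now();
}

/* Enter a critical section (for example, while modifying a pool). */
void block_alarms(void)
{
    ++blocked_depth;
}

/* Leave a critical section; run any alarm that arrived meanwhile. */
void unblock_alarms(void)
{
    --blocked_depth;
    if (blocked_depth == 0 && alarm_was_deferred) {
        alarm_was_deferred = 0;
        handle_alarm_now();
    }
}

/* Returns 0 on success. Real code would probably prefer sigaction(). */
int install_alarm_handler(void)
{
    return signal(SIGALRM, on_alarm) == SIG_ERR ? -1 : 0;
}
```

The idea would be to wrap any code that must not be interrupted halfway
between block_alarms() and unblock_alarms(); whether a much simpler server
really needs this is exactly what I am unsure about.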


Thanks in advance,

Victor