apr-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "William A. Rowe, Jr." <wr...@rowe-clan.net>
Subject Re: Altogether Broken OtherChild logic
Date Thu, 30 Jan 2003 21:48:40 GMT
At 01:30 PM 1/30/2003, Bill Stoddard wrote:
>William A. Rowe, Jr. wrote:
>
>>I belive I've deciphered the "RotateLogs doesn't work for access logs
>>on Windows" Apache 2.0.44 bug.  It's actually many bugs in conformance.
>>
>>Finally, it looks like apr_proc_other_child_read is the function we *really* wanted
>>to use within the health check.  But it seems all of these apr_proc_other_child
>>functions are really misdocumented within APR.  Would someone step up and
>>spell out exactly what they are *supposed* to be doing within unix, and then we
>>can discuss how to make them portable to Win32?  It seems we have too much
>>bubblegum and bailing wire holding them together, and the fixes that I made to
>>do *exactly* what Unix was doing has killed the WinNT mpm.

>What we want to do in the winnt MPM maintenance loop is peridically check 
>to see of the process is still alive. It it is dead, it needs to be restarted (ie, reliable

>piped logs). apr_proc_other_child_read was used in the Unix side of the house 
>to do a periodic read of the pipe to an OC.  On Unix, you can tell if the process 
>on the other end is still alive by doing a read on the pipe. Cannot do that with pipes

>on Windows, so we use apr_proc_other_child_check instead.  

We never check the either end of OC pipes on Unix today in Apache 2.0.
Because we 'hang on' to the pipe and send that same pipe to the next 
generation of child, we catch the death of the child instead.

However, the dying child is caught by apr_proc_wait_all_procs().  This is
bubbled to ap_wait_or_timeout which bubbles the offending PID to the
run_mpm main loop.

The apr_proc_wait_all_procs() is simply a apr_proc_wait(-1) flavor of the same.
Win32 could do the very same thing.  However, I'm considering whether or not
to drive this; will WaitForMultipleObjects (suffering the usual issues of 64 wait 
events) or if it might make more sense to setup RegisterWaitForSingleObject 
on each child process handle.  This may give us most similar results to Unix.

Bug number one, in my mind, was the abuse of the function that was named
apr_proc_other_child_read().  That fn must be called apr_proc_other_child_died().
Only the name of that function changes.

The second abused function was...

>I think we need one function (call it what you will but apr_proc_other_child_check 
>seems okay to me) that checks the status of an OC and performs an action 
>(specified on the call to apr_proc_other_child_check) based on the status.  

The *existing* apr_proc_other_child_check doesn't do at all what you describe on
Unix.  We only call it from the mpm and it's only as we are shutting down or
restarting.  *That* function is misnamed, I'm thinking apr_proc_other_child_restart().
This function would continue to be called as the MPM recycles.

The *new* function you describe (it didn't exist today) would have been nicely named
_check() but that would be a namespace clash.  So I'm thinking _refresh() or some
such that would check the health of the other children and invoke their callbacks only
if something bad has happened.  This might even be on a periodic basis on Unix, if
we discover that some platforms aren't reliably reporting apr_proc_other_child_died()
due to apr_proc_wait_all_procs() missing some signals.

>I am guessing that the windows MPM is whacking the piped logger because 
>ocr->proc->hproc is somehow hosed.

No, that is working fine.  It is whacking it because I modified the code in _check()
to do exactly the same thing on Win32 as it does on Unix.  (Yes, on unix we whack
the children from mod_log_config each restart or shutdown.  Good?  Maybe not but
until this behaves correctly it remains a useful test case for me.)  So now Win32
would whack the child once per second based on WinNT MPM's calls to _check().

And the code in _check() - al la _restart(), the code in _read() al la _died(), and the 
new code in a _refresh() must behave the same between Unix and Win32, or nobody 
will ever successfully code portable modules.

So, renaming _check() to restart() and adding the _refresh() we will safely be able
to wander between the two platforms and not run into this again.  Sure, _check()
sounded like a good place to put the logic you added, but it was entirely different
in purpose from the Unix code.  That's unacceptable.  I fault the docs, again, for
not providing sufficient details.

Sure, the MPM details can be different, but they should be *converging* around
common functions, not divirging wildly.  If my theory on RegisterWaitForSingleObject
to signal apr_proc_wait_all_procs() actually works, we might just see that convergence.

I started this thread not to ask what Win32 was doing, but what the essential design
from Unix was.  Once I understand that, I stand a real chance to get Win32 identical
or so similar you don't notice the discrepancies.  This may involve an extra function,
then again it might not.  

But Unix sure didn't agree with the documented apr_thread_proc.h documentation, 
and that's what really drove me nuts these few days as I continue to try untangling
the problem.

Bill



Mime
View raw message