From: Jeff Trawick
Date: 16 May 2001 07:34:06 -0400
Subject: Re: Other Child processing
Delivered-To: mailing list dev@apr.apache.org

"Bill Stoddard" writes:

> apr_proc_other_child_register()
> apr_proc_other_child_unregister()
> Register an OC and unregister an OC. These are the only OC functions that
> make sense... continuing...
>
> apr_proc_probe_writable_fds()
> This function seems worthless and doesn't appear to be used for anything.
> You cannot make any meaningful diagnosis on an OC if the fds are not
> writable. Perhaps the pipe is full, perhaps the process is dead, ???. No
> way of knowing how to handle this case.

agreed...

> apr_proc_other_child_read(apr_proc_t *pid, int status)
> Simply results in the call to *ocr->maintenance(APR_OC_REASON_DEATH, ...)
> which is totally non-intuitive behaviour given the name of the function.
> Presumably apr_proc_other_child_read is called after you discover an OC has
> failed and that calling maintenance with APR_OC_REASON_DEATH is the right
> thing to do.

Presumably.  I don't know if the problem is with the semantics or just
the function name.

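To make the behaviour described for apr_proc_other_child_read() concrete,
here is a toy model of it: look the pid up in the list of registered
other-children and, if it is there, call its maintenance routine with a
"death" reason.  This is only a sketch and not APR's code.  Every name in it
(oc_rec, oc_register, oc_unregister, oc_report_death, count_deaths,
OC_REASON_DEATH) is a stand-in invented for illustration, and the caller
supplies the record storage where APR would use a pool.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Stand-in reason code; APR's real constants are named APR_OC_REASON_*. */
enum { OC_REASON_DEATH = 0 };

typedef void (*oc_maint_fn)(int reason, void *data, int status);

/* One registered other-child (hypothetical layout). */
typedef struct oc_rec {
    pid_t pid;
    oc_maint_fn maintenance;
    void *data;
    struct oc_rec *next;
} oc_rec;

static oc_rec *oc_list = NULL;

/* Link a caller-owned record onto the front of the list. */
void oc_register(oc_rec *rec, pid_t pid, oc_maint_fn fn, void *data)
{
    assert(rec != NULL && fn != NULL);
    rec->pid = pid;
    rec->maintenance = fn;
    rec->data = data;
    rec->next = oc_list;
    oc_list = rec;
}

/* Unlink a record; does nothing if it is not registered. */
void oc_unregister(oc_rec *rec)
{
    oc_rec **pp;

    for (pp = &oc_list; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == rec) {
            *pp = rec->next;
            rec->next = NULL;
            return;
        }
    }
}

/* If pid is a registered other-child, tell its maintenance routine that
 * it died.  Returns 1 if the pid was known, 0 otherwise. */
int oc_report_death(pid_t pid, int status)
{
    oc_rec *r;

    for (r = oc_list; r != NULL; r = r->next) {
        if (r->pid == pid) {
            r->maintenance(OC_REASON_DEATH, r->data, status);
            return 1;
        }
    }
    return 0;
}

/* Example maintenance routine: count death notifications in *data. */
void count_deaths(int reason, void *data, int status)
{
    (void)status;
    if (reason == OC_REASON_DEATH)
        ++*(int *)data;
}
```

The return value lets a caller tell "this pid was one of mine" from "never
heard of it", which is the distinction that matters when a parent reaps a
child it did not start as a server process.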
> apr_proc_other_child_check()
> Runs the list of OCs and checks to see if they are dead or alive and calls
> *ocr->maintenance based on whether the OC is dead or alive. threaded.c calls
> this routine multiple times during -shutdown-, after the process group has
> been signaled to die. What are we checking and why? This is just
> goofy...

I understand the logic for when they are dead but not the logic for
when they are alive.  I think we agree on this.

> Straw man proposal...
> 1. apr_proc_other_child_*register()
> Leave the register and unregister functions the same
>
> 2. apr_proc_other_child_check()
> It makes sense to me to use this routine when you want OCs to stay up and
> alive. You would call it during idle_server_maintenance and it would detect
> when an OC has died and call maintenance to restart it. The Unix
> implementation of Apache HTTP would probably not use this routine as the MPM
> parent processes use other mechanisms to detect child death. This would be
> good for Windows to detect child death.

Hmmm... If the Unix MPMs call this in idle_server_maintenance() then
they can ignore the death of processes they don't really know about.
If they don't ignore the death of such processes they'll need some
other API to see if a newly-deceased process was a registered
other-child.  It seems simpler just to call
apr_proc_other_child_check().

Oh, I see that this "other API" is apr_proc_other_child_maintenance(),
described below.

> 3. apr_proc_probe_writable_fds()
> Remove it and all references to it.

yep

> 4. apr_proc_other_child_shutdown()
> Signals each OC to shut down. When the OC has died, calls maintenance
> reporting OC DEATH (rather than LOST, which would imply a restart)

So how long do we sit in here?  As long as necessary?  On Unix we
should do SIGTERM followed by SIGKILL five or so seconds later if
the process hasn't gone away.
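
For the Unix case, the SIGTERM-then-SIGKILL escalation could look roughly
like the sketch below.  It is plain POSIX rather than APR code: oc_terminate
and its 100 ms polling interval are hypothetical, and a real
apr_proc_other_child_shutdown() would also have to call the registered
maintenance routine once the child has been reaped.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper, not an APR function.  Sends SIGTERM to a child,
 * polls for its exit for up to grace_seconds, then falls back to SIGKILL.
 * Returns 0 with the wait status stored in *status, or -1 on error. */
int oc_terminate(pid_t pid, int grace_seconds, int *status)
{
    const struct timespec tick = { 0, 100L * 1000L * 1000L };  /* 100 ms */
    int polls = grace_seconds * 10;
    pid_t rv;

    assert(status != NULL);
    if (pid <= 0 || grace_seconds < 0) {
        /* pid <= 0 would address a process group, not a single child */
        errno = EINVAL;
        return -1;
    }

    if (kill(pid, SIGTERM) == -1)
        return -1;

    /* Give the child a chance to exit on its own. */
    for (;;) {
        rv = waitpid(pid, status, WNOHANG);
        if (rv == pid)
            return 0;
        if (rv == -1 && errno != EINTR)
            return -1;
        if (polls-- <= 0)
            break;
        nanosleep(&tick, NULL);
    }

    /* Grace period expired: SIGKILL cannot be caught or ignored. */
    if (kill(pid, SIGKILL) == -1)
        return -1;
    do {
        rv = waitpid(pid, status, 0);
    } while (rv == -1 && errno == EINTR);
    return rv == pid ? 0 : -1;
}
```

A caller would use something like oc_terminate(pid, 5, &status) for the
"five or so seconds" case and then inspect status with WIFSIGNALED() and
WTERMSIG() to see which signal actually ended the child.
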
This would solve the apparent Solaris 2.6 SNAFU affecting Apache
1.3+rotatelogs which Greg Ames mentioned on new-httpd yesterday.

> 5. apr_proc_other_child_maintenance(apr_proc_t *pid, action)
> Perform specific OC maintenance. If threaded.c detects that an OC has gone
> down, it would call...
> apr_proc_other_child_maintenance(apr_proc_t *pid, APR_OC_REASON_LOST) to
> cause the appropriate maintenance routine to be called.

I guess you mean "If a Unix MPM detects that some child process has
died and it isn't a server process, then it would call
apr_proc_other_child_maintenance() which will first see if it is a
registered other-child and if so cause the appropriate maintenance
routine to be called."

Another missing piece is auto-cleanup of other child registrations
when the pool associated with the registration goes away.  I posted a
patch for this a couple of weeks ago.

-- 
Jeff Trawick | trawickj@bellsouth.net | PGP public key at web site:
       http://www.geocities.com/SiliconValley/Park/9289/
             Born in Roswell... married an alien...