httpd-dev mailing list archives

From Greg Ames <grega...@remulak.net>
Subject Re: weird problem on Solaris 2.6 - signals / shell ??
Date Fri, 08 Jun 2001 20:08:49 GMT
Justin Erenkrantz wrote:
> 
> Anybody have any thoughts on this?  I haven't received any responses.
> This bug is a big PITA on Solaris...
> 

ok, I'm starting to catch up with you on this.  Thanks a bunch for all
the digging you did :-)

> I think this patch might work (haven't tested it), but I'm not really
> sure.  

I will test this...it looks quite reasonable after reading your analysis
and some of the 1.3 code.  Unfortunately I don't have a Solaris 2.7 
account, but if it fixes Solaris 2.6 and 8, it ought to make the 2.7 
customer happy.  

>          Someone who knows the otherchild code would be able to tell
> for sure.  -- justin

1.3 otherchild didn't look too bad when I plowed thru it earlier.  
However, it seems like there was a misunderstanding of when 
to use the various OC_REASON_blah's.

> 
> Index: http_main.c
> ===================================================================
> RCS file: /home/cvspublic/apache-1.3/src/main/http_main.c,v
> retrieving revision 1.535
> diff -u -r1.535 http_main.c
> --- http_main.c 2001/04/12 17:49:26     1.535
> +++ http_main.c 2001/06/01 23:50:35
> @@ -2492,7 +2492,7 @@
>             waitret = waitpid(ocr->pid, &status, WNOHANG);
>             if (waitret == ocr->pid) {
>                 ocr->pid = -1;
> -               (*ocr->maintenance) (OC_REASON_DEATH, ocr->data, (ap_wait_t)status);
> +               (*ocr->maintenance) (OC_REASON_RESTART, ocr->data, (ap_wait_t)status);
>             }
>             else if (waitret == 0) {
>                 (*ocr->maintenance) (OC_REASON_RESTART, ocr->data, (ap_wait_t)-1);
> 

> >
> > The symptoms that I see on Solaris are thus:
> >
> > 1) The rotatelogs process for ErrorLog directive loses its parent
> >    after startup.  Hence, its ppid is init.  I'm not sure how this
> >    plays in, but this doesn't look right.  I also can't recreate 2
> >    without having ErrorLog be a piped log.
> >

that seems bogus all right, but I think your fix for 2) will cure our 
immediate pain.

> > 2) On shutdown (via SIGTERM), a race condition occurs.
> >    reclaim_child_processes kills all of the httpd children (fine),
> >    but the second half of the r_c_p call is a bit odd.  Basically,
> >    it calls piped_log_maint with OC_REASON_DEATH - this triggers
> >    piped_log_spawn to start up a new child since pl->program isn't
> >    NULL.  This is completely and utterly wrong.  We're supposed to
> >    be shutting down, not starting up.  These new children never
> >    receive the SIGTERM - they will stick around until another
> >    SIGTERM occurs.  The old children have already quit due to the
> >    SIGTERM, but the piped_log_spawn starts up new rotatelogs
> >    processes.
> >
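
To make the distinction in 2) concrete, here is a minimal sketch of a
maintenance callback that respawns on an unexpected death but not when
it is told the server is restarting or shutting down.  It is not the
1.3 code: the sketch_* names, the struct layout, and the fake pid
values are all hypothetical, and no real process is forked.  The only
detail taken from the analysis above is that a respawn happens when
the program field isn't NULL.

```c
#include <stddef.h>

/* Hypothetical reason codes, loosely named after the OC_REASON_*
 * values discussed in this thread.  Values are illustrative only. */
enum sketch_reason {
    SKETCH_REASON_DEATH,    /* child died unexpectedly while serving */
    SKETCH_REASON_RESTART   /* server is restarting or shutting down */
};

/* Hypothetical stand-in for the per-log bookkeeping structure. */
struct sketch_piped_log {
    const char *program;    /* command to run, NULL if none */
    int pid;                /* current child pid, -1 if no child */
    int spawn_count;        /* number of (pretend) spawns so far */
};

/* Pretend to spawn a new logger child; no real fork happens here. */
static int sketch_spawn(struct sketch_piped_log *pl)
{
    pl->spawn_count++;
    pl->pid = 1000 + pl->spawn_count;   /* fake pid for illustration */
    return pl->pid;
}

/* Maintenance callback: respawn only for an unexpected death. */
static void sketch_maint(enum sketch_reason reason,
                         struct sketch_piped_log *pl)
{
    switch (reason) {
    case SKETCH_REASON_DEATH:
        pl->pid = -1;
        if (pl->program != NULL) {
            sketch_spawn(pl);           /* keep logging alive */
        }
        break;
    case SKETCH_REASON_RESTART:
        pl->pid = -1;                   /* going down: do not respawn */
        break;
    }
}
```

With a callback shaped like this, a caller that reaps children during
shutdown would pass the restart reason, so no new logger is started
after the old one has exited.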
