httpd-modules-dev mailing list archives

From Joe Lewis <...@joe-lewis.com>
Subject Re: Debugging: child process 14446 still did not exit, sending a SIGTERM
Date Fri, 16 Oct 2009 19:37:26 GMT
Michael B Allen wrote:
> On Fri, Oct 16, 2009 at 2:42 PM, Joe Lewis <joe@joe-lewis.com> wrote:
>   
>> Michael B Allen wrote:
>>     
>>> On Fri, Oct 16, 2009 at 1:10 PM, Joe Lewis <joe@joe-lewis.com> wrote:
>>>
>>>       
>>>> Michael B Allen wrote:
>>>>
>>>>         
>>>>> I have a customer who very occasionally sees apache workers hang. I'm
>>>>> pretty sure this is caused by an errant module but I don't know which
>>>>> one.
>>>>>
>>>>> Is there any way to determine which module is causing Apache workers
>>>>> to hang?
>>>>>
>>>>> Can I temporarily disable that SIGTERM so that I can have enough time
>>>>> to attach GDB to the hanging processes?
>>>>>
>>>>> Mike
>>>>>
>>>>>
>>>>>           
>>>> Perhaps run it in a non-forking mode (httpd -X -k start) inside of gdb
>>>> and see what it hangs on?
>>>>
>>>>         
>>> If I run it in gdb like you suggest:
>>>
>>>  # gdb httpd
>>>  (gdb) run -X -k start
>>>
>>> I cannot get httpd to run module deinitialization. Meaning if I do
>>> apachectl stop or httpd -X -k stop or graceful-stop in another
>>> terminal, it just kills the whole process group. Since the problem is
>>> hanging during module deinitialization I don't think this is going to
>>> help me. How do I shutdown httpd so that it runs the module
>>> deinitialization routines?
>>>
>>> Otherwise does anyone have a web-svn pointer to the code that's
>>> calling the SIGTERM? Maybe I can find a way to disable it.
>>>
>>> Mike
>>>
>>>       
>> Disabling SIGTERM for apache would be akin to leaving the landing gear of
>> your airplane on the ground when you take off.  How are you going to
>> properly shutdown apache if you completely kill the SIGTERM signals?
>>     
>
> SIGTERM should not be used to stop processes. A process should
> complete gracefully and call exit(2). Normally, this is what httpd
> does. However if a child process takes too long, something is sending
> a SIGTERM to *kill* the process. I assume this is Apache since it's
> writing a message in error_log to that effect. This is what I want to
> disable. Meaning, if a child process hangs, I want it to just sit
> there stuck forever until an operator can login and attach gdb to it.
>
> If I could find that part of the code, I might find a directive that
> controls how long Apache waits before it sends the SIGTERM.
>
>   
>> The "deinitialization" - are you just not seeing the messages you'd normally
>> see?  Or did apache just terminate (which is normal in gdb, which causes the
>> gdb session to terminate as well).
>>     
>
> Right. I have an Apache module that writes to a separate log. When the
> module is deinitialized, information is written to the log. Without
> gdb, that information is correctly written to the log. When running in
> gdb, nothing is written to the log. It seems the entire process group
> is simply being killed. And thus the part of interest is not
> accessible.
>
> Mike
>   

The SIGTERMs are occurring because apache has already attempted to stop
a process gracefully, and it isn't stopping.  Rather than endlessly try
to "gracefully" shut down a child process, apache will presume that the
process is just not going to respond.
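
As a rough illustration of that wait-then-escalate pattern, here is a
small shell sketch.  It is not Apache's code: the function name
reap_or_term and the 3 second default are made up for the example, and
the only thing borrowed from the thread is the wording of the log
message in the subject line.

```shell
#!/bin/sh
# reap_or_term PID [SECONDS]
# Hypothetical helper, not taken from httpd: give a process a limited
# time to exit on its own, then send SIGTERM if it is still there.
reap_or_term() {
    pid=$1
    limit=${2:-3}        # assumed default; not Apache's real timeout
    waited=0
    # kill -0 sends no signal; it only checks that the PID still exists
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$limit" ]; then
            echo "child process $pid still did not exit, sending a SIGTERM"
            kill -TERM "$pid"
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0             # process went away without being signalled
}
```

Raising the limit in a sketch like this is the equivalent of what Mike
is asking for: more time with the stuck process before anything signals
it.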

You can always try the worker MPM rather than the prefork MPM.

As it stands, from the sound of the problem and the rarity of it (your
previous descriptions), you are going to be "hit and miss" on tracking
it down.  You could potentially recompile all of the modules and apache
itself (placing debug log lines in each one), but the problems may
actually go away in that case, especially if you switch versions.

I do know that some distributions' versions of apache exhibited behavior 
similar to what you have described (specifically, SuSE), so I don't know 
if compiling a new version would alleviate the customer gripe.

I only have two real suggestions: strace the processes, and hope the
hard drive is big enough to capture the output from strace until the
problems are encountered, or try upgrading the version of Apache.
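
For the strace route, something along these lines might do as a
starting point.  It is only a sketch: the function name, the output
directory and the "httpd" process name are assumptions (the binary is
called apache2 on some systems), and children forked after you run it
will not be picked up unless you run it again or trace the parent.

```shell
#!/bin/sh
# trace_httpd_children [PROCESS_NAME] [OUTPUT_DIR]
# Hypothetical helper: attach one strace to each running process with
# the given name, so every PID gets its own log file.
trace_httpd_children() {
    name=${1:-httpd}                     # assumed process name
    outdir=${2:-/var/tmp/httpd-strace}   # assumed path; use a large disk
    mkdir -p "$outdir" || return 1
    for pid in $(pgrep -x "$name"); do
        # -f follows forks, -tt adds timestamps, -o writes to a file
        strace -f -tt -p "$pid" -o "$outdir/strace.$pid" &
    done
}
```

Run it as root, e.g. "trace_httpd_children httpd /var/tmp/httpd-strace",
and when a worker hangs, look at the tail of that PID's file to see the
last system call it made.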

Joe
-- 
Joe Lewis
Chief Nerd 	SILVERHAWK <http://www.silverhawk.net/> 	

------------------------------------------------------------------------
/With every passing hour our solar system comes forty-three thousand 
miles closer to globular cluster 13 in the constellation Hercules, and 
still there are some misfits who continue to insist that there is no 
such thing as progress.
    --Ransom K. Ferm/
