httpd-dev mailing list archives

From Rainer Jung <rainer.j...@kippdata.de>
Subject Re: Unclean process shutdown in event MPM?
Date Sun, 25 Apr 2010 18:07:34 GMT
On 23.03.2010 15:30, Jeff Trawick wrote:
> On Tue, Mar 23, 2010 at 10:04 AM, Rainer Jung <rainer.jung@kippdata.de> wrote:
>> On 23.03.2010 13:34, Jeff Trawick wrote:
>>>
>>> On Tue, Mar 23, 2010 at 7:19 AM, Rainer Jung <rainer.jung@kippdata.de> wrote:
>>>>
>>>> I can currently reproduce the following problem with the 2.2.15 event
>>>> MPM under high load:
>>>>
>>>> When an httpd child process gets shut down due to the max spare threads
>>>> rule while it holds established client connections for which it has
>>>> fully received a keep-alive request but not yet sent any part of the
>>>> response, it will simply close those connections.
>>>>
>>>> Is that expected behaviour? It doesn't seem reproducible with the
>>>> worker MPM. The behaviour was observed using extreme spare-thread
>>>> settings in order to make processes shut down often, but it still
>>>> doesn't seem right.
>>>
>>> Is this the currently-unhandled situation discussed in this thread?
>>>
>>>
>>> http://mail-archives.apache.org/mod_mbox/httpd-dev/200711.mbox/%3Ccc67648e0711130530h45c2a28ctcd743b2160e22914@mail.gmail.com%3E
>>>
>>> Perhaps Event's special handling for keepalive connections results in
>>> the window being encountered more often?
>>
>> I'd say yes. I know from the packet trace that the previous response on
>> the same connection got "Connection: Keep-Alive". But from the time gap
>> of about 0.5 seconds between receiving the next request and sending the
>> FIN, I guess the child was not already in the process of shutting down
>> when the previous "Connection: Keep-Alive" response was sent.
>>
>> So for me the question is: if the web server has already acknowledged
>> the next request (in our case a GET request, acknowledged by a TCP ACK),
>> should it delay shutting down the child until the request has been
>> processed and the response has been sent (which would then include
>> "Connection: close")?
>
> Since the ACK is out of our control, that situation is potentially
> within the race condition.
>
>>
>> For the connections which do not have another request pending, I see no
>> problem in closing them, although there could be a race condition. When
>> there's a race (the client sends the next request while the server sends
>> its FIN), the client doesn't expect the server to handle the request (the
>> same can always happen when a keep-alive connection times out). In the
>> situation observed it is annoying that the server already accepted the
>> next request and nevertheless closes the connection without handling it.
>
> All we can know is whether or not the socket is readable at the point
> where we want to gracefully exit the process.  In keepalive state we'd
> wait for {timeout, readability, shutdown-event}, and if readable at
> wakeup then try to process it unless
> !c->base_server->keep_alive_while_exiting &&
> ap_graceful_stop_signalled().
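
That check would presumably boil down to something like the following 
sketch (keep_alive_while_exiting being the new per-vhost flag from your 
patch; process_socket() here just stands for however event pushes the 
connection to a worker):

     /* sketch: wakeup handling for a connection in keepalive state */
     if (READABLE_SOCKET) {
         if (!c->base_server->keep_alive_while_exiting
             && ap_graceful_stop_signalled()) {
             ap_lingering_close(c);   /* exiting: drop the connection */
         }
         else {
             process_socket(...);     /* serve the pending request; the
                                         response gets "Connection: close" */
         }
     }
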
>
>> I will do some testing around your patch
>>
>> http://people.apache.org/~trawick/keepalive.txt
>
> I don't think the patch will cover Event.  It modifies
> ap_process_http_connection(); ap_process_http_async_connection() is
> used with Event unless there are "clogging input filters."  I guess
> the analogous point of processing is inside Event itself.
>
> I guess if KeepAliveWhileExiting is enabled (whoops, that's
> vhost-specific) then Event would have substantially different shutdown
> logic.

I have now been able to take a second look at it. Directly porting your 
patch to trunk and event is straightforward. One hard problem remains, 
though: the listener thread runs a big loop of the form

     while (!listener_may_exit) {
         apr_pollset_poll(...);
         while (HANDLE_EVENTS) {
             if (READABLE_SOCKET)
                 ...              /* hand the connection to a worker */
             else if (ACCEPT)
                 ...              /* accept and schedule a new connection */
         }
         HANDLE_KEEPALIVE_TIMEOUTS
         HANDLE_WRITE_COMPLETION_TIMEOUTS
     }

Obviously, if we want to respect any previously returned "Connection: 
Keep-Alive" headers, we can't terminate the loop on listener_may_exit. 
As a first attempt, I switched to:

     while (1) {
         if (listener_may_exit)
             ap_close_listeners();     /* stop accepting new connections */
         apr_pollset_poll(...);
         REMOVE_LISTENERS_FROM_POLLSET /* closed listeners must not stay
                                          in the pollset */
         while (HANDLE_EVENTS) {
             if (READABLE_SOCKET)
                 ...
             else if (ACCEPT)
                 ...
         }
         HANDLE_KEEPALIVE_TIMEOUTS
         HANDLE_WRITE_COMPLETION_TIMEOUTS
     }

Now the listeners get closed, and in combination with your patch the 
connections are no longer dropped; instead they receive a "Connection: 
close" response to their next request.

Now the while loop lacks a correct break criterion. It would need to 
stop when the pollset is empty (listeners removed, all other connections 
closed at the end of their keep-alive or due to a timeout). 
Unfortunately there is no API function for checking whether there are 
still sockets in the pollset, and it isn't straightforward to determine 
that otherwise.
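
One workaround would be to shadow the pollset with our own entry counter, 
bumped and decremented around every add/remove (a sketch, not an existing 
APR facility; apr_atomic keeps it safe against the worker threads):

     #include "apr_atomic.h"

     static volatile apr_uint32_t pollset_entries;

     /* route every apr_pollset_add()/apr_pollset_remove() call
      * through these wrappers */
     static apr_status_t tracked_add(apr_pollset_t *ps,
                                     const apr_pollfd_t *pfd)
     {
         apr_status_t rv = apr_pollset_add(ps, pfd);
         if (rv == APR_SUCCESS)
             apr_atomic_inc32(&pollset_entries);
         return rv;
     }

     static apr_status_t tracked_remove(apr_pollset_t *ps,
                                        const apr_pollfd_t *pfd)
     {
         apr_status_t rv = apr_pollset_remove(ps, pfd);
         if (rv == APR_SUCCESS)
             apr_atomic_dec32(&pollset_entries);
         return rv;
     }

     /* in the listener loop, after the listeners have been removed: */
     if (listener_may_exit && apr_atomic_read32(&pollset_entries) == 0)
         break;   /* pollset drained: safe to let the listener exit */

That requires discipline about using the wrappers everywhere, but it 
stays O(1) and doesn't poke into pollset internals.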

Another possibility would be to wait for the maximum of the vhost 
keepalive timeouts, but that seems a bit too much.
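
Roughly (a sketch against trunk, where keep_alive_timeout is an 
apr_interval_time_t):

     /* upper bound on how long established keep-alive connections
      * might still produce requests */
     server_rec *s;
     apr_interval_time_t max_keepalive = 0;

     for (s = ap_server_conf; s != NULL; s = s->next) {
         if (s->keep_alive_timeout > max_keepalive)
             max_keepalive = s->keep_alive_timeout;
     }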

Any ideas or comments?

Regards,

Rainer
