httpd-dev mailing list archives

From Marc Slemko <ma...@znep.com>
Subject Re: Apache 1.2b7-dev performance
Date Fri, 07 Feb 1997 06:26:58 GMT
On Thu, 6 Feb 1997, Dean Gaudet wrote:

> Do we have a list of OSs that properly support SO_LINGER?  I'm trying to

Not AFAIK.

> figure out how I could test that... 'cause I have to run with
> -DNO_LINGCLOSE on hotwired's IRIX 5.3 servers to avoid FIN_WAIT_2 death. 

Don't suppose you could throw a packet sniffer in and get a record of
one of the connections that ends up stuck in FIN_WAIT_2?  I mean one
that stays there for a long time, not one that is only there because the
client hasn't done a shutdown on their end for a while.  Didn't think so.
<sigh>

> I haven't tried SO_LINGER because I'm not sure how to discern between good
> and bad SO_LINGER. 

Enable it and see if the server crashes?  If it is broken, the 
server will crash in great flames.  Even worse than FIN_WAIT_2's killing
it.  <g>  Checking the setsockopt man page to see if they say 
something like "the timeout option for SO_LINGER doesn't actually
do anything" could perhaps give you some idea.
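
For reference, turning SO_LINGER on and reading the setting back might
look roughly like the sketch below.  It is in Python rather than the
server's C, the helper name enable_so_linger is made up for this note,
and it assumes the usual struct linger layout of two ints.  Reading the
value back only shows that the kernel stored the option, not that the
timeout is actually honored, which is exactly the hard part.

```python
import socket
import struct

# Assumed layout of struct linger: { int l_onoff; int l_linger; }
LINGER_FMT = "ii"


def enable_so_linger(sock, seconds):
    """Turn SO_LINGER on with the given timeout and read the setting back.

    Hypothetical helper for illustration; returns (enabled, seconds) as
    reported by the kernel.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                    struct.pack(LINGER_FMT, 1, seconds))
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                          struct.calcsize(LINGER_FMT))
    onoff, linger = struct.unpack(LINGER_FMT, raw)
    # Some systems report a nonzero flag value rather than 1 for l_onoff.
    return bool(onoff), linger


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print(enable_so_linger(s, 30))
    s.close()
```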

There may or may not be issues with some implementations not doing
everything that lingering_close() does to make things happy.

If you really want to know how IRIX does on them, you could possibly 
contact the below guy at SGI.

---------- Forwarded message ----------
Date: Thu, 30 Jan 1997 21:45:40 -0800
From: Steve Alexander <sca@refugee.engr.sgi.com>
To: Marc Slemko <marcs@znep.com>
Subject: Re: IRIX and FIN_WAIT_2 timeout (was Re: Frequency of RST terminated  connections)

Marc Slemko <marcs@znep.com> writes:
>Sorry for grabbing your name off end2end-interest, but I'm not sure who
>would be the one to talk to at SGI about this and since I don't have any
>boxes myself, trying to go through support would likely be futile...  If
>you are not the appropriate person, I would appreciate it if you could
>direct me to them or forward my message.

I am one of the appropriate people, so no problem.

>With Apache 1.2 betas, on many systems there are a lot of connections
>which get left hanging around in FIN_WAIT_2.  This appears to be due to a
>function added which simulates SO_LINGER (but that is supposed to actually
>work, as opposed to SO_LINGER which is very broken half the time).  This
>is necessary to properly handle things such as sending error messages on
>PUTs and to handle various parts of persistent connections (aka.
>keepalives); if the socket is fully closed, the client will get a RST if
>we get more data before it knows about the close.  This breaks things, so
>to avoid that we have a lingering_close() function that does a half close
>and then sits there just throwing away anything it gets until either the
>client acknowledges the FIN or we timeout.
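
The sequence described above (half close, discard whatever arrives,
stop when the client closes or a timer runs out) could be sketched as
follows.  This is an illustration in Python, not the Apache source: the
timeout values, the 512-byte read size, and the returned status strings
are all assumptions made for the example.

```python
import select
import socket
import time


def lingering_close(sock, idle_timeout=2.0, max_wait=30.0):
    """Half-close sock, then throw away input until the peer closes.

    Returns "peer_closed", "timeout" or "error".  The timeouts and the
    status strings are illustrative assumptions.
    """
    try:
        # Send our FIN but keep reading; same as shutdown(s, 1) in C.
        sock.shutdown(socket.SHUT_WR)
    except OSError:
        sock.close()
        return "error"

    result = "timeout"
    deadline = time.monotonic() + max_wait
    while time.monotonic() < deadline:
        readable, _, _ = select.select([sock], [], [], idle_timeout)
        if not readable:
            break  # nothing arrived within idle_timeout seconds
        try:
            data = sock.recv(512)  # discard whatever the client sent
        except OSError:
            result = "error"
            break
        if not data:
            result = "peer_closed"  # EOF: the client's FIN arrived
            break

    # Full close, whichever way the loop ended.
    sock.close()
    return result


if __name__ == "__main__":
    server_side, client_side = socket.socketpair()
    client_side.sendall(b"data sent after the server gave up")
    client_side.shutdown(socket.SHUT_WR)
    print(lingering_close(server_side, idle_timeout=0.5, max_wait=2.0))
    client_side.close()
```

The final close() is what separates this from a bare shutdown for
writing: the socket is always fully closed once the loop ends.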

Yes, I have had several reports about this, all on systems running Apache.  I
don't understand what Apache could be doing that causes this, except shutting
the socket down for writing and not closing it.  If you closed it, the 2MSL
timer would start in FIN-WAIT-2 because SS_CANTRCVMORE would be set.

The BSD keepalive code, for no clear reason, did not time out connections in
states > CLOSING.

>For whatever reason, this results in a large number of connections left in
>the FIN_WAIT_2 state.  There are several bugs in popular clients and
>client TCP stacks that appear to be contributing to the problem, and
>possibly something in the server's TCP stack or the Apache code which is
>doing something wrong.

Can you tell me specifically what the lingering_close() function does?
Is it shutdown(s, 1)?

>In any case, the point is that we have been told by a SGI customer that
>they had asked SGI about adding a timeout to IRIX for the FIN_WAIT_2
>state, and SGI said they would not because it wasn't in the RFC, end of
>story.  Would you be able to confirm this?

Nobody ever talked to me about it.

In forthcoming rollup patches for 5.3, 6.2, and 6.3, I have changed the
keepalive timer so that it will time out connections in all states.  A patch
for 6.4 will be done after 6.4 officially releases.  I am not sure that this
is the best mechanism, but we have customers using Apache that have to reboot
their servers every two days, which is completely unacceptable to them, and to
me.

>Regardless of any bugs that may be present in Apache that are contributing
>to this, several clients _do_ have a tendency to leave connections in this
>state regardless of server.

I think what Apache is doing is goofy, but I don't understand why so many
clients send neither a FIN nor a RST.  Either one would cause IRIX to
terminate the connection properly.  Perhaps some clients don't retransmit
their FINs and we miss the first one?  I have no clue.

I have no operational experience with Apache, but I don't believe that this
problem has been seen with either Netscape or Zeus, FYI.

In the future, please feel free to contact me about other IRIX-specific issues.
I want to make sure that IRIX is an excellent platform for web serving,
regardless of the server being used.

Thanks,
-- Steve


