httpd-dev mailing list archives

From: Marc Slemko
Subject: Re: more lingering_close...
Date: Sun, 09 Feb 1997 21:39:26 GMT

On Sun, 9 Feb 1997, Ben Laurie wrote:

> Jim Jagielski wrote:
> > 
> > Ben Laurie wrote:
> > > 
> > > Hang on. I understood up to this point (finally), but surely this is wrong?
> > > If the server isn't doing an l_c(), then it won't have half-closed at this
> > > point, it will have full-closed, and hence the RST's are exactly what is
> > > expected. In this case, we should definitely half-close (which is surely the
> > > point of l_c()?), or wait for the client to complete their send (which may
> > > be lame - but since the spec doesn't give us a slot at this point in the protocol
> > > to say "OK" or "Oops", we shouldn't really be talking or closing connections
> > > yet).
> > > 
> > 
> > The problem that I see with waiting until the send is completed is
> > that it's either all or nothing... We must continue to wait until
> > we receive the AllClear from the client. If we have a timeout then we
> > are once again in the position where we close the link before
> > the "client is ready"; if we continue to wait, then for slow clients
> > or, even worse, clients that just hang there and never finish, we
> > tie up a process and never close the link. Even SO_LINGER in the
> > socket layer waits a certain amount of time (or, well, it _should_
> > wait).
> I think we need to be clear here. The scenario described is one where there is
> a client with a slow link doing a PUT which fails (because the URL [or some

No.  Not just a slow link.  The same thing happens over my loopback
interface, and that sure isn't a slow link.  (although its MTU of 16384
could be making it behave a little differently than expected here).

> other bit of header] is bad). The client may, as a result of the slow link,
> not get the error message, because, well, I'm not sure why. I thought I

You can't fully see it from my tcpdump because the sequence numbers on the
RSTs are shown as absolute ones while the ones on the preceding packets
are relative; i.e. you can't match them up.

To reiterate the key point in why it does not get the error message:

> > *** THIS IS IMPORTANT: *** When the client gets the RST, the RST
> > includes the sequence number of the last packet from the server
> > ACKed by the client at the time the RST was sent.  The client WILL
> > normally flush any buffered incoming data received from the server
> > after that sequence number.  This means that if the client is to
> > reliably get the entire error message to display, the server MUST
> > NOT send a RST until it has received an ACK of the last packet in
> > the error message it sends to the client.  Nothing the client
> > can do without modifying the TCP stack can change this, no matter what
> > it does with errors.

Once you understand that you will understand why.  To fully illustrate
that point would require tcpdumps from both the client and server side of
a connection to see just what each end knows when it sends each packet.
The sequence number included in the RST will most likely be one from
before the end of the error message, so everything after that will be
flushed by the client.
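
As a toy illustration of the rule quoted above (this is plain arithmetic, not a model of a real TCP stack), assume relative sequence numbers that start at 0 for the first byte of the error message. The function names and parameters are hypothetical, invented only for this sketch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: how many bytes of the error message the client keeps
 * if, as described above, it flushes everything past the sequence
 * number carried in the RST.
 *   message_len  : total bytes of the error message sent by the server
 *   acked_at_rst : bytes the client had ACKed when the RST was sent */
size_t error_bytes_client_keeps(size_t message_len, size_t acked_at_rst)
{
    size_t kept = acked_at_rst < message_len ? acked_at_rst : message_len;

    assert(kept <= message_len);
    return kept;
}

/* The server may only reset once the whole message has been ACKed. */
bool rst_is_safe(size_t message_len, size_t acked_at_rst)
{
    return acked_at_rst >= message_len;
}
```

With a 1000-byte error message of which only 512 bytes had been ACKed when the RST went out, the client in this model keeps 512 bytes and loses the rest.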

> understood, but I'm still digesting Marc's tcpdump and Stevens. I'm beginning
> to think that we've made a more fundamental error. An l_c() may fix it, but
> I'm not convinced that this isn't just hiding the true error.

You can argue that for the PUT case and I would be willing to agree for
the PUT case.  But there is no need to fix the PUT case (well, there are
other reasons but that's a different issue...) as long as we still have
the persistent connection case to worry about. 

> A timeout is a different matter. We should not l_c() a timed out connection -
> that is silly - we are aborting it, because it isn't behaving, so trying to be
> nice to it is self-destructive.

Not a connection which times out due to the "Timeout", no; we just abort
those with a close() (or a shutdown(sd,2), depending on whether it is a hard
timeout or a soft one...). However, when you get into the keepalive issue
we need to do a lingering_close on a keepalive timeout; keepalive timeouts
are normal behavior.
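
For reference, a minimal sketch of what a lingering close amounts to: half-close our side, then keep reading and discarding whatever the client sends until it closes or goes quiet. This is not Apache's actual lingering_close(); the name lingering_close_sketch and the idle_secs parameter are assumptions made for illustration.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper, not Apache's lingering_close().
 * Returns 0 if the client closed its side first, 1 if we gave up after
 * idle_secs without data, -1 on a socket error (a signal interrupting
 * select() is also treated as an error here). sd is always closed. */
int lingering_close_sketch(int sd, int idle_secs)
{
    char junk[512];
    int result;

    assert(idle_secs >= 0);

    /* Half-close: send our FIN but keep receiving (same as shutdown(sd, 1)). */
    if (shutdown(sd, SHUT_WR) < 0) {
        close(sd);
        return -1;
    }

    for (;;) {
        fd_set readable;
        struct timeval tv;
        ssize_t n;
        int ready;

        FD_ZERO(&readable);
        FD_SET(sd, &readable);
        tv.tv_sec = idle_secs;
        tv.tv_usec = 0;

        ready = select(sd + 1, &readable, NULL, NULL, &tv);
        if (ready < 0) { result = -1; break; }
        if (ready == 0) { result = 1; break; }   /* client went quiet */

        n = read(sd, junk, sizeof junk);
        if (n < 0) { result = -1; break; }
        if (n == 0) { result = 0; break; }       /* client sent its FIN */
        /* n > 0: discard the data and keep draining */
    }

    close(sd);
    return result;
}
```

Note that the timeout in this sketch restarts every time data arrives, so a client that trickles bytes could still tie up the process, which is exactly Jim's concern; a real version would also want a cap on the total time spent lingering.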
