hc-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jeff Dever" <jdev...@nortelnetworks.com>
Subject RE: Patch pending on scanning for HTTP 1.1 - attached
Date Tue, 03 Dec 2002 00:24:32 GMT
Hey Eric,

This is great work.  I think most of us has been having connection close
problems.  Adrian has done a good job of evaluating this patch, please
consider his comments and resubmit.

I am planning on committing the httpclient/httpmulticlient merge patch
tonight, so you should line up your changes after that.

-jsd


-----Original Message-----
From: Adrian Sutton [mailto:adrian.sutton@ephox.com]
Sent: Monday, December 02, 2002 6:18 PM
To: 'Commons HttpClient Project'
Subject: RE: Patch pending on scanning for HTTP 1.1 - attached


Hi Eric,
I've been hitting these problems for a fair while now and have come to many
of the same conclusions as you have about how to solve them.  Side note: I'm
a fairly new comer around here but these issues have become something of a
pet project for me.

>The patch consists of the following:

    * AutoCloseInputStream now does not call "super.read()" once close
      has been called on the connection itself.  The close() function is
      now implemented to close the underlying _connection_ rather than
      just the socket input stream.  Failing to close the connection
      would cause an unnecessary (and sometimes troublesome) recoverable
      exception with subsequent requests.

Agreed, this is the cause of an annoying problem and would be a better way
to solve it than just retrying as I currently do.

    * ChunkedInputStream does not close the underlying stream anymore.
       It does, however, "exhaust" the remaining chunked stream when the
      "close()" method is called, eliminating _almost_ all of the need
      for HttpClient to scan for HTTP/1.1 lines (which is error prone,
      due to the unpredictability of the content of the underlying
      responses).  The "read" methods check to see if the connection is
      closed and throw an exception.

There was a change made just before I came on board with HttpClient so that
the close method of all input stream wrappers closed the underlying stream
so that it was compliant with the general contract for InputStream.close()
(ie: that it would close the stream).  This seems like an ideologically good
thing but does seem to cause strange problems with existing code,
effectively because it means that the close method should never be called
outside of HttpClient.

    * ContentLengthInputStream - similar changes to that of
      ChunkedInputStream.

Similar comments too. :)

    * The HeadMethod now calls shouldCloseConnection and
      connection.close() as appropriate.

A definite bonus.  We need to do a review of the merged HttpClient to ensure
that the expected calls are made in the right order etc.  I'll do up a
checklist of the things that can cause these problems once I manage to get
the release out the door for the product I'm paid to work on. :)  Feel free
to beat me to it though.

    * HttpClient.executeMethod() now calls InputStream.close() on the
      previous response's input stream (this is part that may be tricky
      to carry over to the MultiClient merge).  In this way, a client of
      a "Get" method, for example, need not actually read the entire
      response - calling close has the effect of doing so, or closing
      the underlying connection, as appropriate.

If I understand this correctly, this is a bad thing.  Http allows for keep
alive connections which absolutely require that the entire response be read,
no ifs or buts (otherwise you can't find the start of the next response).
By closing the input stream, you're effectively removing the ability for
keep alive connections thus adding all the HTTP overhead to every connection
and not complying to the HTTP 1.1 spec.

    * HttpConnection - now keeps track of the InputStream for the last
      executeMethod() request.

I assume this is so the above can be implemented?

    * HttpMethodBase - Now keeps track of the number of "recoverable
      exceptions" - so that I could write test cases that monitor
      whether recoverable exceptions are being thrown when they
      shouldn't be!  Also calls "close" on the input stream when done
      reading it, and sets the connection's last response input stream.

Very useful, but shouldn't call close IMO.  We need to add some
comprehensive tests for these issues to the test suite.

    * TestGetMethodLocal - added an additional test where the "GET"
      method response body is never read, but is followed immediately by
      another GET request.

Excellent, however from previous discussions I believe that this is expected
to fail with HttpClient because the user is expected to read the response
body every time.  I'm not sure I like that conclusion but I just tend to tow
the line around here. :)

    * TestHttpClientLocalHost - now does "GET" requests from
      "/httpclienttest/body" (based on system props), rather than "/" as
      the root of the server is not controlled by the standard WAR file.
       I was getting 302 responses, and thus test failures, until I
      pointed to the new location.
    * TestMethodsLocalHost - same as TestHttpClientLocalHost
    * TestMethodsNoHost - added new tests of the HeadMethod.

Good, good and good. :)

>Miscellaneous random notes:

    * I ran into my difficulties initially because I was using Tomcat
      4.1.10 with the old (now deprecated) HttpConnector (1.1
      compliant).  It turns out that it returns a "Connection: close"
      header on any request that fails to indicate content length either
      via a content-length header or a chunked encoding.  This sounds
      similar to someone else's posting about difficulties with the
      "Connection: close" header.

I'll have to check up on what the connection: close problems were, generally
I've found that when Connection: close is specified everything works
perfectly all the time, it's the Connection: keep-alive that can cause
problems.  Any hints about what the problems were?

    * I had subsequent difficulties due to sending the results of a GET
      request to Xerces for XML parsing.  Xerces would eventually call
      "close" on the stream passed to it.  This would have the
      unfortunate effect, when working with ContentLengthInputStream for
      example, of closing the underlying socket - without notifying the
      connection.  Now, the close changes the state of the
      ContentLengthInputStream (by exhausting the input stream), but
      does not close the socket.  Of course, if the AutoCloseInputStream
      wrapper gets used, the connection will be closed....

Yes, MultipartPostMethod did this as well and it caused problems.  Giving
the user the user the power to close the input stream can cause problems but
is a neccessary evil because there are valid reasons for the user closing
the stream (particularly if errors occur in the stream and you want to go
back to a known stable state).

The behaviour I would suggest for the close method of *every* input stream
that HttpClient uses is:

1. close the underlying stream so that we are compliant with the general
contract for InputStream.close()
2. close the connection so that we don't try and reuse it in the future
3. release the connection so that MultiClient can immediately create a new
connection and get on with any other pending requests.

With that behaviour the user would then be required to *either*, read the
entirety of the response or call close(), however calling close would be
discouraged as it prevents connection reuse.

>Enjoy.  Please let me know of any feedback you might have, I will be 
>happy to take another shot at revising.

Looks like it covers most of the situations that these errors occur in, the
only real issue is whether or not close() should close the underlying
stream.

>-Eric.

Adrian Sutton, Software Engineer
Ephox Corporation.

--
To unsubscribe, e-mail:
<mailto:commons-httpclient-dev-unsubscribe@jakarta.apache.org>
For additional commands, e-mail:
<mailto:commons-httpclient-dev-help@jakarta.apache.org>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message