commons-dev mailing list archives

From: Marc Saegesser
Subject: RE: [httpclient] Lots of patches and discussion
Date: Thu, 14 Feb 2002 14:46:36 GMT
Some clarification.

The patches I sent were of two types: fixes for bugs I encountered when I
tried to use HttpClient, and new classes that add new functionality.  The bug
fixes should go in prior to a release; they fix real bugs (and I'm
reasonably confident they didn't create new ones).  The other stuff might be
more appropriate on a separate branch or in a proposals directory.  That's
what I want to discuss: where does it belong?  That's a decision for the
developer community.

I think you misunderstood my user feedback comments.  I'm not talking about
asking the 'user community' what they think.  I'm talking about HttpClient
code needing to ask a 'user' for instructions.  Here's an example:
RFC2616/10.3.2 states that a 301 result in response to a POST request MUST
NOT be automatically followed unless it can be confirmed by the user.
HttpClient doesn't currently have any mechanism to ask the user for
confirmation and that's what I want to add.  I'm not talking about adding
any UI stuff, just some interfaces that developers can implement if they
want to be notified when there's a need for user confirmation.  They can
implement any kind of UI or whatever they want.
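
To make the idea concrete, here is a rough sketch of the kind of callback
meant above.  The names RedirectConfirmation and RedirectPolicy are invented
for illustration and are not part of HttpClient; the only rule taken from
the RFC is that a 301 to a non-GET/HEAD request must not be followed without
confirmation.

```java
// Hypothetical types for illustration only; nothing here exists in HttpClient.
interface RedirectConfirmation {
    /** Return true if the user agrees to resend the request to newLocation. */
    boolean confirmRedirect(String method, int statusCode, String newLocation);
}

final class RedirectPolicy {
    private RedirectPolicy() {}

    /**
     * Decide whether a redirect may be followed.  GET and HEAD are followed
     * automatically; any other method needs an explicit "yes" from the handler.
     */
    static boolean shouldFollow(String method, int statusCode, String newLocation,
                                RedirectConfirmation handler) {
        boolean safeMethod = "GET".equalsIgnoreCase(method)
                || "HEAD".equalsIgnoreCase(method);
        if (safeMethod) {
            return true;
        }
        if (handler == null) {
            // Nobody to ask, so the redirect is not followed.
            return false;
        }
        return handler.confirmRedirect(method, statusCode, newLocation);
    }
}
```

An application with a UI could implement confirmRedirect() by showing a
dialog; a headless one could return a fixed answer or consult configuration.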

I'm not sure what you're trying to figure out how to turn on or off.  The
HttpUrlMethod stuff has *no* impact on any existing code (well, except that
I added getPath() to URIUtil because I couldn't find a more appropriate
place to put it).  HttpUrlMethod extends the HttpMethod interface and all
the URL*Method classes extend their respective HttpMethod implementation
classes.  You either use HttpClient and the HttpMethod classes or
HttpMultiClient and the HttpUrlMethod classes.  
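
Roughly, the relationship looks like the sketch below.  These are heavily
simplified stand-ins, not the real interfaces or classes (the real ones have
many more methods), and the use of java.net.URI and the pathOf() helper are
assumptions made for the example.

```java
// Simplified stand-ins for illustration; not the actual HttpClient types.
interface HttpMethod {
    String getPath();
}

// The URL-aware interface adds the full URL on top of the existing contract.
interface HttpUrlMethod extends HttpMethod {
    java.net.URI getUrl();
}

// Existing style: constructed from a path only.
class GetMethod implements HttpMethod {
    private final String path;

    GetMethod(String path) {
        this.path = path;
    }

    @Override
    public String getPath() {
        return path;
    }
}

// New style: extends the existing implementation and implements HttpUrlMethod.
class UrlGetMethod extends GetMethod implements HttpUrlMethod {
    private final java.net.URI url;

    UrlGetMethod(String url) {
        this(java.net.URI.create(url));
    }

    private UrlGetMethod(java.net.URI url) {
        super(pathOf(url));
        this.url = url;
    }

    // Hypothetical helper: the path portion of the URL, or "/" if it is empty.
    static String pathOf(java.net.URI url) {
        String path = url.getPath();
        return (path == null || path.isEmpty()) ? "/" : path;
    }

    @Override
    public java.net.URI getUrl() {
        return url;
    }
}
```

Code written against HttpMethod and GetMethod keeps working unchanged; only
callers that need the host information use the HttpUrlMethod side.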

Marc Saegesser 

> -----Original Message-----
> From: dIon Gillard
> Sent: Wednesday, February 13, 2002 9:28 PM
> To: Jakarta Commons Developers List
> Subject: Re: [httpclient] Lots of patches and discussion
> Marc Saegesser wrote:
> [snip]
> >Now a question.  What is the status of the HttpClient 2.0 release?  The
> >code is currently tagged alpha 1 but the RELEASE_PLAN_2_0.txt document
> >hasn't been modified since October, 2001.  I ask because, depending on how
> >imminent an actual release is, some of the changes that I'm proposing
> >should probably be made on a separate branch.
> >
> It's waiting on the committers being comfortable that it's ready. I've
> been doing mainly maintenance on httpclient recently, so I'm not the
> best one to decide when it's ready to go.
> >Here's my story.  I have need of something like HttpClient in my product
> >but I found that I had to extend it somewhat.  The extensions are very
> >generic and I believe useful to others so I'd like to add them to the
> >HttpClient project.  I also found several bugs that I fixed along the way.
> >I've documented these changes below.
> >
> Cool.
> >I need to be able to use HttpClient (or a derivative) to navigate around
> >the web pretty much like a regular user-agent.  I want to be able to
> >access any site and any web application that I can reach with a reasonably
> >modern browser.  HttpClient does a good job of implementing the client
> >side of RFC 2616.  Unfortunately, there are lots of sites and some very
> >big name applications that do not implement the server side correctly.
> >Some sites (Yahoo! in particular) actually require a broken client
> >implementation just to log in.  Here are two examples of things I've found
> >so far.  RFC2616/10.3.3 forbids changing a 302 redirected POST method into
> >a GET method but acknowledges that most clients are broken in this regard
> >(this is the failure that Yahoo! requires).  I have found sites that send
> >relative URLs in the Location: header of a redirect (this violates
> >RFC2616/14.30).  Supporting these sites will require 'breaking'
> >HttpClient.  I propose adding some kind of flag to put HttpClient into a
> >'compatibility mode' that implements this and any other required broken
> >behaviour.
> >
> This sounds like a great idea
> >A second need is to provide a mechanism for getting user acknowledgment
> >for certain actions, for example when redirecting from secure to
> >non-secure sites.
> >
> >I am going to start working on these changes next but I want to discuss
> >them with the HttpClient community to see if they feel they belong in the
> >commons HttpClient project or if the project should be forked.
> >
> You've emailed the development community. I'm not sure many of the 
> 'user' community hang out here. My preference in this one is that it 
> belongs in httpclient as a strict vs relaxed mode.
> >Anyway, below is a description of the modified and new files.  The
> >patches and new files are attached.
> >
> >Modified files...
> >
> >
> >  -  Added support for old Netscape cookies.  The biggest difference is
> >that the test for valid domains is different for Netscape cookies and RFC
> >2109 cookies.
> >  -  Added a space after the semicolons separating the values.  This is
> >required by sites that only implement the old Netscape cookie
> >specification.
> >  -  Added an additional date format for expiration times.
> >
> >
> >  -  The write*() and print*() methods now throw
> >HttpRecoverableException.
> >
> >
> >  -  Added a new exception class, HttpRecoverableException.  There are
> >some error conditions that we can try to recover from internally.  The
> >biggest one I found was when a server unexpectedly closed the socket.  In
> >this case we should just try to re-open the connection and try the request
> >again.
> >  -  Fixed a problem with the handling of 100 status codes.  If we get a
> >100 after we've already sent the request body, RFC 2616 states that the
> >response should be ignored.  The current implementation incorrectly broke
> >out of the loop looking for the response.
> >
> This last one sounds like a bug that should be fixed anyway.
> >
> >  -  Always recreate the cookie header.  A redirect response may have
> >included additional cookies that we need to send with the redirected
> >request and the path may have changed, thus requiring a different cookie
> >set.
> >
> Ditto.
> >
> >  -  Fixed readRequestBody implementation.  A new version of this
> >function also takes an output stream.  This makes it easier for subclasses
> >to use this implementation directly instead of having to re-implement it
> >in order to support things like saving the response to a file.
> >  -  Better support for responses that don't contain a Content-Length or
> >Transfer-Encoding header.  By the specification, if these headers are both
> >absent, the response has no body content.  In the real world what this
> >means is that the server probably didn't know the length when the response
> >was committed.  It just sends the response and closes the connection when
> >the body is complete.  This assumption falls apart when we get a response
> >that *can not* contain a body.  In this case, the simple implementation
> >keeps reading looking for a response body and actually ends up reading the
> >next response headers as the body.  I've added a list of responses that,
> >according to the specification, can not ever have a body and fixed
> >readResponseBody() to not read a body for these responses.
> >
> Again, sounds like another bug.
> >
> >  -  Added getPath() method.  This method returns the path portion of a
> >given URL.  The only difference from 
> > is that this method returns "/" if the URL's path is empty.
> >
> >
> >  -  Switched to new HttpMethodBase.readResponseBody().
> >
> >New files...
> >
> >
> >  -  Replacement for HttpClient.  This class serves two purposes.  First,
> >it handles off-site redirects.  Second, it is intended to be used within a
> >multithreaded application that, like a browser, may have more than one
> >request outstanding to a given server and have requests going to more than
> >one server.
> >  -  Since HttpMultiClient, unlike HttpClient, simultaneously handles
> >requests for multiple servers it can't use HttpMethod classes directly
> >because they only include path information, not server information.  A new
> >interface, HttpUrlMethod, is used that extends HttpMethod.
> >
> >
> >  -  A simple wrapper around HttpState to synchronize access to data.
> >This is required to support the multi-threaded nature of HttpMultiClient.
> >
> >
> >  -  This is actually the heart of HttpMultiClient.  It keeps track of
> >available HttpConnections for host:port combinations.  The number of
> >connections to a given host:port is limited (per RFC 2616) and if the
> >limit is reached calls to getConnection() will block until a connection
> >becomes available.
> >
> >
> >  -  Extends HttpException.  This exception is thrown when a potentially
> >recoverable error has occurred (e.g. a socket connection was closed
> >unexpectedly).  Higher level code can attempt to try the operation again.
> >
> >
> >  -  An interface that extends HttpMethod.  HttpUrlMethod classes are
> >initialized with a fully qualified URL instead of just the path component.
> >
> >
> >
> >
> >
> >
> >  -  These classes extend their respective method classes and implement
> >HttpUrlMethod.
> >
> >Marc Saegesser 
> >
> These all sound like good additions. What I think we need to work out
> is how do we turn this on or off?
> -- 
> dIon Gillard, Multitask Consulting
