hc-dev mailing list archives

From "St Jacques, Robert" <RStJacq...@crt.xerox.com>
Subject RE: streaming responses
Date Wed, 29 Sep 2004 20:01:32 GMT
Sure enough, this is the case; I was calling "getResponseBody" deep in my
code.  Thanks for helping me find it ;)

-----Original Message-----
From: Oleg Kalnichevski [mailto:olegk@apache.org] 
Sent: Wednesday, September 29, 2004 3:56 PM
To: Commons HttpClient Project
Subject: RE: streaming responses


Bob,

"Buffering response body" message is written ONLY inside getResponseBody,
which in its turn uses getResponseBodyAsStream. See for
yourself:

http://jakarta.apache.org/commons/httpclient/xref/org/apache/commons/httpclient/HttpMethodBase.html#681

It means that there's a bit of code in your application that calls
getResponseBody or getResponseBodyAsString, which causes the
buffering of the response body.

getResponseBodyAsStream does reconstruct the input stream in case the
response has already been buffered, but by default it returns the raw input
stream when available. Feel free to examine the source:

http://jakarta.apache.org/commons/httpclient/xref/org/apache/commons/httpclient/HttpMethodBase.html#709
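
To make the difference concrete, here is a minimal sketch of consuming a response as a stream, so that memory use stays at one small buffer regardless of the body size. StreamCopy and its copy method are hypothetical helpers written for this illustration; they are not part of HttpClient. The ByteArrayInputStream only stands in for the stream that getResponseBodyAsStream would return.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper for illustration only; not part of HttpClient.
final class StreamCopy {
    private StreamCopy() {}

    /** Copies 'in' to 'out' through one fixed-size buffer; returns the byte count. */
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int n;
        // Only 'bufferSize' bytes are held in memory at any time.
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the stream returned by getResponseBodyAsStream().
        InputStream body = new ByteArrayInputStream(new byte[100_000]);
        // A real download would more likely write to a FileOutputStream.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(body, sink, 4096);
        System.out.println("copied " + copied + " bytes");
    }
}
```

In an application the input would come from method.getResponseBodyAsStream() after executeMethod, with releaseConnection called in a finally block once the copy is done. Calling getResponseBody or getResponseBodyAsString anywhere before that point defeats the purpose, since those buffer the whole body.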

What would HttpClient be worth if it always buffered the response body?

I have no idea what led you to believe that the "processResponseHeaders"
method buffers the response. The method is even called processResponseHeaders,
not processResponseBody or something.

Here's what the readResponseBody method does:

(1) Gets the raw input stream from the connection
http://jakarta.apache.org/commons/httpclient/xref/org/apache/commons/httpclient/HttpMethodBase.html#2014

(2) If chunked encoding is used, wraps it with ChunkedInputStream. No
buffering
http://jakarta.apache.org/commons/httpclient/xref/org/apache/commons/httpclient/HttpMethodBase.html#2038

(3) If Content-Length header is used, wraps it with
ContentLengthInputStream, which basically ensures that content will not be
read past its length. No buffering
http://jakarta.apache.org/commons/httpclient/xref/org/apache/commons/httpclient/HttpMethodBase.html#2072

(4) If there's anything wrong with either Transfer-Encoding or
Content-Length header, it just leaves the raw input stream alone. No
buffering

(5) Finally it attaches an AutoCloseInputStream to the resultant input
stream, which is intended to close the underlying connection once the entire
response is consumed. No buffering.

(6) Keeps the resultant input stream.
http://jakarta.apache.org/commons/httpclient/xref/org/apache/commons/httpclient/HttpMethodBase.html#1979

(7) Now, if getResponseBodyAsStream is called, it will simply return the
input stream without any further manipulations:
http://jakarta.apache.org/commons/httpclient/xref/org/apache/commons/httpclient/HttpMethodBase.html#710

(8) End of story
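
The wrapping described in steps (3) and (5) can be sketched roughly as follows. This is not HttpClient's source: BoundedInputStream, CloseOnEofInputStream and WrapperDemo are hypothetical names invented for this illustration, and closing the wrapped stream here only approximates what AutoCloseInputStream does for the connection. The point is that each wrapper delegates reads to the stream beneath it and holds no copy of the data.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical: refuses to read past 'limit' bytes (the idea behind step 3). */
final class BoundedInputStream extends FilterInputStream {
    private long remaining;

    BoundedInputStream(InputStream in, long limit) {
        super(in);
        this.remaining = limit;
    }

    @Override public int read() throws IOException {
        if (remaining <= 0) return -1;
        int b = super.read();
        if (b != -1) remaining--;
        return b;
    }

    @Override public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) return 0;
        if (remaining <= 0) return -1;
        // Never ask the underlying stream for more than is left in the body.
        int n = super.read(buf, off, (int) Math.min(len, remaining));
        if (n > 0) remaining -= n;
        return n;
    }

    @Override public long skip(long n) throws IOException {
        long skipped = super.skip(Math.min(n, remaining));
        remaining -= skipped;
        return skipped;
    }

    @Override public boolean markSupported() { return false; }
}

/** Hypothetical: closes the wrapped stream at end of input (the idea behind step 5). */
final class CloseOnEofInputStream extends FilterInputStream {
    private boolean closed;

    CloseOnEofInputStream(InputStream in) { super(in); }

    @Override public int read() throws IOException {
        if (closed) return -1;
        int b = super.read();
        if (b == -1) close();
        return b;
    }

    @Override public int read(byte[] buf, int off, int len) throws IOException {
        if (closed) return -1;
        int n = super.read(buf, off, len);
        if (n == -1) close();
        return n;
    }

    @Override public void close() throws IOException {
        if (!closed) {
            closed = true;
            super.close();
        }
    }

    boolean isClosed() { return closed; }
}

final class WrapperDemo {
    public static void main(String[] args) throws IOException {
        // 11 body bytes followed by bytes that belong to something else on the wire.
        byte[] wire = "hello worldEXTRA".getBytes(StandardCharsets.US_ASCII);
        InputStream raw = new ByteArrayInputStream(wire);
        CloseOnEofInputStream body =
                new CloseOnEofInputStream(new BoundedInputStream(raw, 11));

        StringBuilder text = new StringBuilder();
        byte[] buf = new byte[4];
        int n;
        while ((n = body.read(buf)) != -1) {
            text.append(new String(buf, 0, n, StandardCharsets.US_ASCII));
        }
        System.out.println(text + " closed=" + body.isClosed());
    }
}
```

The caller sees only the bytes inside the declared length, and the stream closes itself as soon as the caller hits the end. Nothing is read until the caller asks for it.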

Hope this clarifies things a bit

Oleg

On Wed, 2004-09-29 at 21:17, St Jacques, Robert wrote:
> Oleg,
> 
> This is exactly what I tried to do, but the HttpClient seems to buffer 
> the response anyway.  I poked through the code a bit, and did some 
> searching on the mailing list, and this seems to be the case.  Also, 
> when I run tests, the log output from the HttpClient seems to indicate 
> that the request is being buffered before I even get an opportunity to 
> call "getResponseBodyAsStream".  I tried this on a request that is 
> about 17 MB and just sat back and watched as the response was 
> buffered; I ended up terminating the program after a minute or two 
> without ever actually getting past the execute method call.  Here is 
> the output from the logs:
> 
>   enter HttpMethodBase.processResponseHeaders(HttpState, HttpConnection)
>   enter GetMethod.readResponseBody(HttpState, HttpConnection)
>   enter HttpMethodBase.readResponseBody(HttpState, HttpConnection)
>   enter HttpMethodBase.readResponseBody(HttpState, HttpConnection)
>   enter HttpConnection.getResponseInputStream()
>   Buffering response body
> <lots of logs detailing the bytes read>
> 
> From here it looks like the "processResponseHeaders" method (which is
> called by the HttpMethodBase during an execute) automatically buffers the 
> response. Is this true, or am I totally misreading the signs?
> 
> Thanks,
> Bob
> 
> -----Original Message-----
> From: Oleg Kalnichevski [mailto:olegk@apache.org]
> Sent: Wednesday, September 29, 2004 3:04 PM
> To: Commons HttpClient Project
> Subject: Re: streaming responses
> 
> 
> Bob,
> 
> There's no special magic involved. Make sure you use 
> HttpMethod#getResponseBodyAsStream, which will return the raw input 
> stream, and not its buffering counterparts 
> HttpMethod#getResponseBodyAsString and HttpMethod#getResponseBody.
> 
> You should not worry about chunking. HttpClient will decode chunked 
> input on the fly, if necessary. Just grab the raw input stream and do 
> whatever response retrieval suits your application best.
> 
> Hope this helps
> 
> Oleg
> 
> 
> On Wed, 2004-09-29 at 20:43, St Jacques, Robert wrote:
> > Howdy,
> > 
> > I've just started using HttpClient for testing.  The product that I am
> > developing includes a software download feature that downloads 
> > (sometimes very large) files over the internet using HTTP.  The reason 
> > we use HTTP is that many of our customers are unwilling to open their 
> > proxies or firewalls to other protocols.
> > 
> > In order to accommodate these downloads, we must either be able to
> > stream the response to a GET, or we need to be able to ensure that the 
> > response we get from the server will be chunked in such a way that the 
> > chunks are small enough to buffer.  I know that HttpClient supports 
> > chunking, but I am unable to 'force' our web server to chunk up the 
> > data.  Furthermore, the HTTP 1.1 RFC indicates that the server does 
> > not need to guarantee that a response, even if chunked, will be small 
> > enough to conveniently buffer on the client.
> > 
> > What is the best way to use HttpClient to retrieve large amounts of
> > data (as large as 200 MB or more)?
> > 
> > Thanks,
> > Bob St. Jacques
> > Member Research Technical Staff
> > Xerox Corp.
> > (585)231-8306
> > 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: 
> commons-httpclient-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: 
> commons-httpclient-dev-help@jakarta.apache.org
> 

