hc-httpclient-users mailing list archives

From Oleg Kalnichevski <ol...@apache.org>
Subject RE: Performance issues in ChunkedInputStream
Date Wed, 11 Apr 2007 12:26:55 GMT
On Wed, 2007-04-11 at 08:02 -0400, Tony Thompson wrote:
> > I am having what I consider a fairly significant performance issue 
> > with a ChunkedInputStream in the 3.0.1 client.  I have been packet 
> > tracing a conversation using the HttpClient where the response is 
> > chunked and my application is reading the response stream directly 
> > from the method I executed.  On the wire, frequently during a large 
> > chunked response I see "ZeroWindow" responses from my client to the 
> > server which would indicate that the HttpClient is not getting the 
> > data off the wire fast enough.  I have tried making the 
> > BufferedInputStream in HttpConnection large (128K) and it still fills
> > up (it just takes a little longer).
> >  
> > So, after doing some profiling of ChunkedInputStream I found that a 
> > huge amount of time is spent in 
> > ChunkedInputStream.getChunkSizeFromInputStream().  In the very short 
> > profiling session I ran, ChunkedInputStream.read( byte[], int, int ) 
> > was invoked 2424 times and the time spent in that method (excluding 
> > further method calls) was 870ms.  getChunkSizeFromInputStream() was 
> > invoked 432 times and the time spent in that method was 27762ms 
> > (excluding further method calls).  Does someone who understands that 
> > code better than I have any idea how that can be improved?
> >  
> >
> > Tony,
> >
> > I have spent a fair amount of time and effort profiling HttpCore, a
> > set of low-level HTTP transport components that HttpClient 4.0 will be
> > based on. Overall, HttpClient 4.0 is expected to be 20 to 40% faster
> > than HttpClient 3.x due to improvements in the core HTTP components.
> > The sad truth is that we simply lack the resources to back-port those
> > changes to the HttpClient 3.x code line.
> 
> Unfortunately I need to come up with some kind of fix now.  I am
> currently using this in an environment where this is causing lots of
> issues.  So, I guess that means I have to dig into that myself.  Any
> pointers on what I might be able to do to improve that particular piece
> of the code?
> 

I ended up rewriting it almost completely for HttpClient 4.0. One of the
problems I found is that in lots of places HttpClient 3.x reads one byte
at a time from the input stream in order to be able to detect a CRLF /
LF line delimiter, which may be one of the factors contributing to the
performance issue you have been having. I simply do not see an easy fix
for this problem. 
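
To illustrate the alternative to byte-at-a-time reads, here is a minimal
sketch of a line reader that pulls data from the stream in bulk and scans
its own buffer for the LF delimiter. It is not the HttpClient 3.x or 4.0
code: the class name LineScanner, its fields, and parseChunkSize() are
hypothetical names chosen for this example, and it handles only the
chunk-size line, not the chunk data that follows.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: scans an internal buffer for LF instead of calling read() per byte. */
final class LineScanner {
    private final InputStream in;
    private final byte[] buf = new byte[8192];
    private int pos = 0;    // next unread byte in buf
    private int limit = 0;  // number of valid bytes in buf
    int fills = 0;          // bulk reads issued to the underlying stream

    LineScanner(InputStream in) {
        this.in = in;
    }

    /** Returns the next line without its CRLF or LF terminator, or null at end of stream. */
    String readLine() throws IOException {
        StringBuilder line = new StringBuilder();
        while (true) {
            if (pos == limit) {
                // Buffer is empty: refill it with a single bulk read.
                int n = in.read(buf, 0, buf.length);
                fills++;
                if (n == -1) {
                    return line.length() == 0 ? null : line.toString();
                }
                pos = 0;
                limit = n;
            }
            // Scan the buffered bytes for LF without touching the stream.
            int start = pos;
            while (pos < limit && buf[pos] != '\n') {
                pos++;
            }
            line.append(new String(buf, start, pos - start, StandardCharsets.US_ASCII));
            if (pos < limit) {
                pos++; // consume the LF
                int len = line.length();
                if (len > 0 && line.charAt(len - 1) == '\r') {
                    line.setLength(len - 1); // drop the CR of a CRLF pair
                }
                return line.toString();
            }
        }
    }

    /** Parses the hexadecimal chunk size, ignoring any chunk extension after ';'. */
    static int parseChunkSize(String line) {
        int semi = line.indexOf(';');
        String hex = (semi >= 0 ? line.substring(0, semi) : line).trim();
        return Integer.parseInt(hex, 16);
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = "1a;name=value\r\nabcdefghijklmnopqrstuvwxyz\r\n0\r\n\r\n"
                .getBytes(StandardCharsets.US_ASCII);
        LineScanner scanner = new LineScanner(new ByteArrayInputStream(wire));
        System.out.println("chunk size: " + parseChunkSize(scanner.readLine()));
        System.out.println("bulk reads: " + scanner.fills);
    }
}
```

A real replacement would also have to serve the chunk data from the same
buffer, since bytes beyond the size line have already been consumed from
the underlying stream. That coupling is part of why this is not a small
patch to the 3.x code.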

> > One other issue I have with that code is if I interrupt the file 
> > transfer and call method.abort(), that ChunkedInputStream appears to 
> > still keep pulling data from the host.  Wouldn't it just make more 
> > sense to just close down that connection instead of making it sit 
> > there and pull data that is just dumped into the bit bucket?
> >
> > This is precisely what HttpMethod#abort() does. It simply shuts down
> > the underlying connection. I have a hard time believing any data can
> > be received after the connection socket has been closed. It is
> > plausible, though, that some data may still be read from an
> > intermediate content buffer, but I find this scenario unlikely.
> 
> You may want to take a look at that in the new client.  In the 3.0.1
> client, after that stream is closed, exhaustInputStream() is called
> which attempts to finish reading the content so the connection can be
> ready for another request. 

I do not think this is the case. #exhaustInputStream() is called ONLY if
the connection is being released back to the connection manager.
HttpMethod#abort() simply calls HttpConnection#close(), which in turn
just closes the underlying network socket without trying to exhaust the
input stream.

http://jakarta.apache.org/commons/httpclient/xref/org/apache/commons/httpclient/HttpMethodBase.html#1102
http://jakarta.apache.org/commons/httpclient/xref/org/apache/commons/httpclient/HttpConnection.html#1214
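
The distinction described above between aborting and releasing can be
modelled with a small toy class. This is only a sketch of the behaviour as
described in this message, not HttpClient code: ConnectionModel and its
members are hypothetical names, and an in-memory stream stands in for the
socket.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Toy model (not HttpClient code) of the two ways a response can end. */
final class ConnectionModel {
    private final InputStream body;
    private boolean open = true;
    long drained = 0; // bytes read and discarded during release()

    ConnectionModel(InputStream body) {
        this.body = body;
    }

    /** Models abort: close the connection and leave unread content unread. */
    void abort() throws IOException {
        open = false;
        body.close();
    }

    /** Models release for reuse: drain leftover content, keep the connection open. */
    void release() throws IOException {
        byte[] scratch = new byte[1024];
        int n;
        while ((n = body.read(scratch)) != -1) {
            drained += n;
        }
    }

    boolean isOpen() {
        return open;
    }

    public static void main(String[] args) throws IOException {
        ConnectionModel aborted = new ConnectionModel(new ByteArrayInputStream(new byte[5000]));
        aborted.abort();
        System.out.println("abort:   drained=" + aborted.drained + " open=" + aborted.isOpen());

        ConnectionModel released = new ConnectionModel(new ByteArrayInputStream(new byte[5000]));
        released.release();
        System.out.println("release: drained=" + released.drained + " open=" + released.isOpen());
    }
}
```

In this model only release() pays the cost of reading the remaining
content, which is the trade-off for being able to reuse the connection.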

Hope this helps

Oleg


>  In my case, that is a lot of content and so
> it continues on for a bit before the socket is just reset.  I am not
> continuing to read in my application but looking at a packet trace I can
> see the client is doing it for me.
> 
> Thanks
> Tony
>  
> This message (and any associated files) is intended only for the 
> use of the individual or entity to which it is addressed and may 
> contain information that is confidential, subject to copyright or
> constitutes a trade secret. If you are not the intended recipient 
> you are hereby notified that any dissemination, copying or 
> distribution of this message, or files associated with this message, 
> is strictly prohibited. If you have received this message in error, 
> please notify us immediately by replying to the message and deleting 
> it from your computer. Messages sent to and from Stoneware, Inc.
> may be monitored.
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: httpclient-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: httpclient-user-help@jakarta.apache.org
> 
> 



