hc-httpclient-users mailing list archives

From "Tony Thompson" <Tony.Thomp...@stone-ware.com>
Subject RE: Performance issues in ChunkedInputStream
Date Wed, 11 Apr 2007 17:29:28 GMT
>Hello Tony,
>> I have tried making the BufferedInputStream in HttpConnection large
>> (128K) and it still fills up (it just takes a little longer).
>> [...]
>> One other issue I have with that code is if I interrupt the file 
>> transfer and call method.abort(), that ChunkedInputStream appears to 
>> still keep pulling data from the host.
>As Oleg pointed out, the socket and its I/O streams are closed by
>method.abort(). The streams residing on top of that will not be closed.
>The ChunkedInputStream sits on top of the BufferedInputStream that was
>created by the connection. As long as there is data in the buffer,
>the stream will continue to work. Only when the buffer is exhausted is a
>new call to the socket stream made. That call will then result in
>an IOException indicating that the socket has been closed. In other
>words, CIS is pulling data from the buffer, not from the host.

I can understand that, but judging by a packet trace, that doesn't seem
to be what is happening.  After I call abort() I can still observe
packets flowing on the wire, so something is still requesting data from
the host.  I didn't dig far enough into the HttpClient code to say
for sure what is still going on.  I can only tell you what I am
observing at this point.
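
For reference, the buffer-draining behavior described in the quoted explanation can be reproduced in isolation. This is only a minimal sketch, not HttpClient code: FakeSocketStream and simulateAbort() are hypothetical stand-ins for the socket stream and for what abort() does to it, and the 8-byte buffer is chosen just to keep the numbers small.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class BufferedAfterAbortDemo {

    /** Hypothetical stand-in for the socket stream; fails once "closed". */
    static class FakeSocketStream extends InputStream {
        private final ByteArrayInputStream data;
        private boolean closed = false;

        FakeSocketStream(byte[] bytes) {
            this.data = new ByteArrayInputStream(bytes);
        }

        /** Simulates what closing the socket does to its input stream. */
        void simulateAbort() {
            closed = true;
        }

        @Override
        public int read() throws IOException {
            if (closed) throw new IOException("Socket closed");
            return data.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            if (closed) throw new IOException("Socket closed");
            return data.read(b, off, len);
        }
    }

    /** Returns {bytes served after the abort, 1 if an IOException followed}. */
    static int[] run() {
        FakeSocketStream socket = new FakeSocketStream(new byte[20]);
        // Small 8-byte buffer so the effect is easy to see.
        BufferedInputStream buffered = new BufferedInputStream(socket, 8);
        int servedAfterAbort = 0;
        int sawIOException = 0;
        try {
            buffered.read();          // first read fills the buffer from the "socket"
            socket.simulateAbort();   // the wrapper itself is NOT closed
            while (buffered.read() != -1) {
                servedAfterAbort++;   // still served from the buffer
            }
        } catch (IOException e) {
            sawIOException = 1;       // buffer exhausted, refill hit the closed socket
        }
        return new int[] {servedAfterAbort, sawIOException};
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("bytes served after abort: " + r[0]);
        System.out.println("IOException on refill: " + (r[1] == 1));
    }
}
```

The sketch only shows that a wrapper can keep serving already-buffered bytes after the underlying stream is closed. It says nothing about the packets seen on the wire after abort().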

>I wonder how the BufferedInputStream can fill up. If it is empty, a
>single attempt will be made to read data from the underlying socket
>stream. Then, only the buffered data is accessed until that is
>exhausted. Does your OS buffer enough data on its own to fill a 128 K
>buffer in a single read operation? If it does, I wonder even more why
>parsing the chunk header should be a performance bottleneck.
>Is there suspicious GC activity?

As a matter of fact, the window size in my particular test is only 16k,
so no, I wouldn't be able to fill up the 128k buffer in a single read.
I added some printlns to the CIS.read() method to see how much data is
available on the InputStream before I read it. What I observed was that
the amount of available data would grow and shrink (back down to 0
sometimes), but I could get that buffer to the point where it was
full.  I looked at the source for BufferedInputStream, and it looks like
it tries to fill the empty space in the buffer each time you read from
it (for a socket connection it will read more than one packet of data)
instead of just doing a single read from the underlying stream.  So, if
the host is shoving data back fast enough, that buffer could fill up if
the CIS is not pulling data out of it fast enough, which seems to be
where the bottleneck is.  The longer the CIS takes to process data from
the buffered stream, the longer everything waits around for another
read on the buffered stream (which is what triggers a read on the socket
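
One way to check the refill behavior discussed here outside of HttpClient is to count how many reads a BufferedInputStream issues against its underlying stream. The sketch below is an assumption-laden test harness, not a measurement of the real socket case: PacketSource is a hypothetical stream that hands out at most 4 bytes per read (a stand-in for packets arriving) and reports the rest as available(), and the 16-byte buffer and 10-byte request are arbitrary.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;

class BufferedRefillDemo {

    /** Hypothetical source that hands out at most packetSize bytes per read. */
    static class PacketSource extends InputStream {
        private int remaining;
        private final int packetSize;
        int readCalls = 0; // number of reads issued against this source

        PacketSource(int totalBytes, int packetSize) {
            this.remaining = totalBytes;
            this.packetSize = packetSize;
        }

        @Override
        public int available() {
            return remaining; // pretend everything not yet read is already waiting
        }

        @Override
        public int read() {
            readCalls++;
            if (remaining == 0) return -1;
            remaining--;
            return 0;
        }

        @Override
        public int read(byte[] b, int off, int len) {
            readCalls++;
            if (len == 0) return 0;
            if (remaining == 0) return -1;
            int n = Math.min(len, Math.min(packetSize, remaining));
            for (int i = 0; i < n; i++) b[off + i] = 0; // payload content is irrelevant
            remaining -= n;
            return n;
        }
    }

    /** Returns {bytes returned by one bulk read, underlying reads it triggered}. */
    static int[] run() {
        PacketSource source = new PacketSource(40, 4);
        BufferedInputStream buffered = new BufferedInputStream(source, 16);
        byte[] dest = new byte[10];
        int got;
        try {
            got = buffered.read(dest, 0, dest.length); // a single bulk read
        } catch (IOException e) {
            got = -1;
        }
        return new int[] {got, source.readCalls};
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("bytes returned: " + r[0]);
        System.out.println("underlying reads: " + r[1]);
    }
}
```

With the JDK sources I have looked at, a bulk read keeps going back to the underlying stream while the request is unsatisfied and available() is positive, so this should report 10 bytes from 3 underlying reads. Single-byte read() calls behave differently: each refill is one underlying read.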
