commons-user mailing list archives

From "Julius Davies" <>
Subject RE: [httpclient] how to properly set the timeout + more
Date Sun, 21 May 2006 19:01:10 GMT
Hi, Mikael,

In my experience timeouts are only an occasional problem (e.g. one out of every
10,000 requests might time out).  If you can't get your connection to work at all, the
cause is probably not timeout-related (but I could be wrong).

For data-sets that large, I would save to disk first, and then start the processing using
the local copy.  Do something like this:

InputStream in = method.getResponseBodyAsStream();
OutputStream out = new FileOutputStream( "/path/to/local/file" );
byte[] buf = new byte[ 4096 ];

// read 4KB chunks of data from the stream, and immediately
// write them to disk; read() returns -1 at end of stream
int count = in.read( buf );
while ( count >= 0 )
{
  if ( count > 0 )
    out.write( buf, 0, count );
  count = in.read( buf );
}
// flush and release both streams once the copy is done
out.close();
in.close();

Now that your large data-set is safely stored on local disk, you can use
"new FileInputStream( "/path/to/local/file" )" to finish your processing.

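In case it is useful, here is the same copy loop as a small self-contained sketch
that uses only the JDK.  The class name StreamToDisk and the method copyToFile are
hypothetical names I made up for illustration, and the sketch assumes Java 7 or
later for try-with-resources.  It accepts any InputStream, so with HttpClient you
would pass in the response body stream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not part of HttpClient.
class StreamToDisk {

    // Copies everything from 'in' to 'target' in 4KB chunks, closes both
    // streams (even on failure), and returns the number of bytes written.
    static long copyToFile(InputStream in, Path target) throws IOException {
        byte[] buf = new byte[4096];
        long total = 0;
        try (InputStream src = in;
             OutputStream out = Files.newOutputStream(target)) {
            int count;
            // read() returns -1 at end of stream; 0 is possible and harmless
            while ((count = src.read(buf)) >= 0) {
                out.write(buf, 0, count);
                total += count;
            }
        }
        return total;
    }
}
```

Once copyToFile returns, the data is on disk and the network stream is closed, so
the slow processing step no longer holds the HTTP connection open.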

Julius Davies

-----Original Message-----
From:	Mikael Andersson []
Sent:	Fri 5/19/2006 7:37 AM
Subject:	[httpclient] how to properly set the timeout + more

I am running into some issues when using the HttpClient to retrieve quite
large datasets (20-30 MB). I am thinking that it may have something to do
with timeout settings.

When accessing a resource which does a fair bit of processing on the data
before returning it, the transfer fails after some 13MB. But when accessing
the same data and specifying that I want the raw data, which it streams back
faster, I can retrieve the whole data set.

I am retrieving data which contains several pieces of data in one go, and I
know that data streamed back comes in chunks. The remote resource takes the
first piece of data and annotates it with html and returns it, and then the
next and so on.

The mimetype for the HTML annotated data is also text/html, for the raw data
it is text/plain.

The way I specify the timeout now is like this:
private static long TIME_OUT = 30000;

MultiThreadedHttpConnectionManager connManager = new
MultiThreadedHttpConnectionManager();
HttpClientParams httpClientParams = new HttpClientParams();
httpClientParams.setConnectionManagerTimeout( TIME_OUT );
httpClient = new HttpClient( httpClientParams, connManager );


HttpMethodParams methodParams = new HttpMethodParams();
methodParams.setSoTimeout( (int)TIME_OUT );
getMethod = new GetMethod( urlStr );
getMethod.setParams( methodParams );
getMethod.getParams().setParameter( HttpMethodParams.RETRY_HANDLER, new
DefaultHttpMethodRetryHandler(3, false));

I get the following Exception:
java.lang.ArrayIndexOutOfBoundsException: 3000
        at org.mortbay.util.ByteArrayOutputStream2.writeUnchecked(
        at org.mortbay.jetty.HttpGenerator$OutputWriter.write(
        at javax.servlet.http.HttpServlet.service(
        at javax.servlet.http.HttpServlet.service(
        at org.mortbay.jetty.servlet.ServletHolder.handle(
        at org.mortbay.jetty.servlet.ServletHandler.handle(
        at org.mortbay.jetty.servlet.SessionHandler.handle(
        at org.mortbay.jetty.handler.ContextHandler.handle(
        at org.mortbay.jetty.handler.ContextHandlerCollection.handle(
        at org.mortbay.jetty.handler.HandlerCollection.handle(
        at org.mortbay.jetty.handler.HandlerWrapper.handle(
        at org.mortbay.jetty.Server.handle(
        at org.mortbay.jetty.HttpConnection.doHandler(
        at org.mortbay.jetty.HttpConnection.access$1500(
        at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(
        at org.mortbay.jetty.HttpParser.parseNext(
        at org.mortbay.jetty.HttpParser.parseAvailable(
        at org.mortbay.jetty.HttpConnection.handle(
        at org.mortbay.jetty.nio.SelectChannelConnector$
        at org.mortbay.thread.BoundedThreadPool$

Help would be extremely appreciated :)
