hc-dev mailing list archives

From Leo Galambos <l...@centrum.cz>
Subject Connection break
Date Mon, 04 Aug 2003 16:31:15 GMT
Hi.

I am writing a robot for a search engine. The robot must harvest all files 
shorter than a given limit (let's say 100 kB) - longer files are not 
interesting, because they are usually archives or long pages about 
nothing.

I cannot find a robust way to drop a connection (GET over HTTP/1.0 and 
HTTP/1.1) when the incoming data stream exceeds the upper limit. 
Currently I close the input stream obtained from 
getResponseBodyAsStream(), and then call releaseConnection(). Is that OK?
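To show what I mean, here is a stripped-down sketch of the bounded read 
(pure java.io; in the real robot the stream comes from 
method.getResponseBodyAsStream(), and the class and method names below 
are mine, not part of the HttpClient API):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class BoundedFetch {

    /**
     * Read at most maxBytes from the stream and return what was read.
     * In the robot, 'in' is the stream from method.getResponseBodyAsStream().
     */
    public static byte[] readUpTo(InputStream in, int maxBytes) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        try {
            int n;
            while (buf.size() < maxBytes
                    && (n = in.read(chunk, 0,
                            Math.min(chunk.length, maxBytes - buf.size()))) != -1) {
                buf.write(chunk, 0, n);
            }
            // If we stopped because the limit was hit, the caller closes the
            // stream and calls method.releaseConnection() - my question is
            // whether that is the sanctioned way to abandon the rest of the body.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }
}
```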

My second point is related to the "retrying" example in your docs 
(http://jakarta.apache.org/commons/httpclient/tutorial.html - the catch 
block for HttpRecoverableException). When I did something like that, I 
found that I had to call method.recycle() in the catch block, or the 
connection was not reinitialized and everything failed. Could you 
enlighten me on this? Is it a bug in the guide? (I tried it on 2.0-b1.)
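The pattern I ended up with looks roughly like this generic sketch (the 
Supplier and Runnable stand in for client.executeMethod(method) and 
method.recycle(); the class and method names here are mine, not 
HttpClient API):

```java
import java.util.function.Supplier;

public class Retry {

    /**
     * Retry loop mirroring the tutorial's catch-block pattern.
     * 'reset' stands in for method.recycle(): without running it between
     * attempts, the second executeMethod() failed in my tests on 2.0-b1.
     */
    public static <T> T withRetries(Supplier<T> attempt, Runnable reset, int maxTries) {
        RuntimeException last = null;
        for (int i = 0; i < maxTries; i++) {
            try {
                return attempt.get();
            } catch (RuntimeException e) { // in the robot: HttpRecoverableException
                last = e;
                reset.run();               // in the robot: method.recycle()
            }
        }
        throw last; // all maxTries attempts failed
    }
}
```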

And my last point - when I run the robot under stress conditions, some 
connections seem to freeze, although I use setConnectionTimeout. Is it a 
known issue? How should I debug it so that I can send you a useful log? 
It only happens after 1-2 hours of running, so a full log would be a few 
gigabytes...
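For completeness, this is roughly how I configure the client. If I read 
the 2.0 API right, setConnectionTimeout() only bounds establishing the 
connection, so perhaps I am also missing the socket read timeout - a 
sketch, assuming the 2.0-b1 method names (the 30-second values are just 
my guesses):

```java
HttpClient client = new HttpClient();
client.setConnectionTimeout(30 * 1000); // limit on establishing the TCP connection
client.setTimeout(30 * 1000);           // limit on blocking reads (socket timeout)
```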

Thank you

-g-


