Message-ID: <3F2E8A53.2030307@centrum.cz>
Date: Mon, 04 Aug 2003 18:31:15 +0200
From: Leo Galambos
To: commons-httpclient-dev@jakarta.apache.org
Subject: Connection break

Hi.

I am writing a robot for a search engine. The robot must harvest all files that are shorter than a fixed limit (let's say 100 kB). Longer files are not important, because they are often archives or long sheets about nothing.

I cannot find a robust way to drop a connection (GET over HTTP/1.0 and HTTP/1.1) when the incoming data stream exceeds the upper limit. At the moment I close the input stream returned by getResponseBodyAsStream and then call releaseConnection. Is that OK?

My second point concerns the "retrying" described in your docs (http://jakarta.apache.org/commons/httpclient/tutorial.html, catch block of HttpRecoverableException).
When I do something like this, I find that I have to call method.recycle() in the catch block; otherwise the connection is not reinitialized and every retry fails. Could you enlighten me on this? Is it a bug in the guide? (I have tried it on 2.0-b1.)

And my last point: when I run the robot under stress conditions, some connections seem to freeze, although I use setConnectionTimeout. Is this a known issue? How should I debug it so that you get a useful log? It happens after 1-2 hours of running, so the log could grow to a few gigabytes...

Thank you

-g-
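
To make the first question concrete, here is a minimal sketch of the size cap, using only java.io so that it runs without HttpClient. The class name BoundedReader, the method name readAtMost, and the 100-byte limit in main are illustrative assumptions, not part of any library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class BoundedReader {

    /**
     * Reads the whole stream, but gives up and returns null as soon as
     * more than maxBytes would be accepted. The stream is closed either way.
     * (Hypothetical helper, for illustration only.)
     */
    static byte[] readAtMost(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        try {
            int n;
            while ((n = in.read(buf)) != -1) {
                if (out.size() + n > maxBytes) {
                    return null; // over the limit: stop reading
                }
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            in.close(); // runs on both the normal and the over-limit path
        }
    }

    public static void main(String[] args) throws IOException {
        // In-memory streams stand in for a response body here.
        byte[] small = readAtMost(new ByteArrayInputStream(new byte[10]), 100);
        byte[] large = readAtMost(new ByteArrayInputStream(new byte[5000]), 100);
        System.out.println("small accepted: " + (small != null));
        System.out.println("large rejected: " + (large == null));
    }
}
```

In the robot, the stream passed in would be the one returned by getResponseBodyAsStream, and releaseConnection would be called once readAtMost returns. Whether closing the stream early really drops the socket, rather than reading the rest of the body first, is exactly what I am unsure about.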