From: Gordon Mohr
Date: Tue, 16 May 2006 15:24:08 -0700
To: HttpClient Project
Subject: Re: [jira] Resolved: (HTTPCORE-3) HttpParser triggers unfriendly OutOfMemoryError on challenging input
Message-ID: <446A5108.4050706@archive.org>
In-Reply-To: <23689632.1147727886165.JavaMail.jira@brutus>

If I understand the HttpCore code properly, there is no direct facility for protecting against the OOME in the code -- just a chance to hook in a theoretical alternate implementation that would address the problem. Is that correct?

To use the HttpCore-4.0 facility, it appears I would create my own HttpDataReceiver implementation which keeps a count of the bytes it shovels and throws an IO or HTTP exception when some count is exceeded; create a factory that makes such receivers; and install that factory into each HttpClientConnection instance before it begins receiving data.

This could work, but seems a roundabout and obscure approach. The really valuable feature would be for OOME-resistance -- and friendly, usable indicators that extreme content has been encountered -- to be features of the library. It would require a switch or parameter to enable, rather than patching in custom/third-party code.

Is there a summary of expected dates of Core-4.0/Client-4.0 release somewhere, or any assessments of how the 4.0 codebases match up against 3.0 features? (Is it reasonable for an HttpClient-3.0-using project to consider transitioning to the 4.0 codebase(s)?)
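The byte-counting idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the thread does not show the HttpDataReceiver interface, so the sketch wraps a plain java.io.InputStream instead, and the class name ByteLimitInputStream is hypothetical, not part of HttpCore or HttpClient.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/**
 * Hypothetical wrapper (not an HttpCore class): counts the bytes read
 * through it and throws an IOException once a caller-chosen limit is
 * exceeded, instead of letting a parser buffer data without bound.
 */
class ByteLimitInputStream extends FilterInputStream {
    private final long maxBytes;
    private long count;

    ByteLimitInputStream(InputStream in, long maxBytes) {
        super(in);
        this.maxBytes = maxBytes;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            account(1);
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            account(n);
        }
        return n;
    }

    // Adds to the running total and fails once the limit is passed.
    // Bytes consumed via skip() are not counted in this sketch.
    private void account(long n) throws IOException {
        count += n;
        if (count > maxBytes) {
            throw new IOException("input exceeded limit of " + maxBytes + " bytes");
        }
    }
}
```

A caller would wrap the connection's input before handing it to the parser and treat the IOException as the "response too large" signal; how such a wrapper would be installed in HttpCore-4.0 is not shown here.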
- Gordon @ IA

Oleg Kalnichevski (JIRA) wrote:
>      [ http://issues.apache.org/jira/browse/HTTPCORE-3?page=all ]
>
> Oleg Kalnichevski resolved HTTPCORE-3:
> --------------------------------------
>
>     Resolution: Invalid
>
> Feel free to re-open the issue if you think the problem has not been adequately resolved
>
> Oleg
>
>> HttpParser triggers unfriendly OutOfMemoryError on challenging input
>> --------------------------------------------------------------------
>>
>>          Key: HTTPCORE-3
>>          URL: http://issues.apache.org/jira/browse/HTTPCORE-3
>>      Project: Jakarta HttpCore
>>         Type: Bug
>
>>   Components: HttpCore
>>     Reporter: Gordon Mohr
>
>> Many users of HttpClient use it to connect to servers which generate challenging HTTP responses, such as responses which include an arbitrarily large number of headers or headers of arbitrarily large size. (Sometimes such headers are conformant with the spec, in that they contain legal characters in plausible header formats; other times these are filled with binary content that is a violation of the relevant specs. Even when technically legal, often such giant headers are the inadvertent result of server-side bugs.)
>>
>> As a Java execution environment always has a hard cap on the available heap space, any parsing code which can use an arbitrary amount of memory risks triggering an OutOfMemoryError, either in its own thread or even another thread that happens to need memory after the parsing thread has exhausted it all.
>>
>> Such OutOfMemoryErrors are a particularly unfriendly way to indicate that a practical limit has been exceeded, compared to other options. They can hide the thread of execution which is most to blame. It is hard and awkward to set up handlers that catch and recover from OOMEs wherever they are most likely to occur. Even with such handlers, the actual allocation triggering an OOME may occur in another critical thread, even if that thread has minimal and well-controlled memory needs.
>>
>> HttpClient ought to provide one or more ways for a user to protect against such OOMEs, and instead receive a more convenient/recoverable indication of an HTTP response that is impossible to process with the HttpClient library within the available resources. Many approaches are possible; the easiest would be to allow a user of HttpClient to set their own optional, pragmatic limits on header sizes and number. Then, just as a user may already cleanly cancel the stream-reading of an arbitrarily-long content-body without fouling up their application state, they would be able to cancel the parsing of oversized response headers.
>>
>> Similar issues have been discussed before, for example in Bugzilla bug #25468 (http://issues.apache.org/bugzilla/show_bug.cgi?id=25468) which was to "Provide HttpParser plug-in mechanism." Though that issue is marked resolved/fixed, there is no such plug-in mechanism allowing an OOME workaround in the 3.x HttpClient, and it is not clear that a mechanism/work-around exists in whatever 4.0 work has been completed.
>>
>> So my suggestion is that this new issue be used to uniquely track the OOME risk in HttpParser, and only be considered "fixed" when some version of HttpClient offers an alternative to throwing OOMEs as a way of dealing with challenging HTTP responses. Alternatively, this could simply become the issue in the new system for collecting user-contributed workarounds/patches.
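
The "optional, pragmatic limits on header sizes and number" suggested in the quoted issue might look something like the sketch below. It is not HttpClient's HttpParser and not an API from either codebase: the class BoundedHeaderReader, its constructor parameters, and its error messages are all assumptions made for illustration. It reads header lines one byte at a time and throws a recoverable IOException, rather than growing a buffer until the heap runs out.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical header reader (not HttpClient's HttpParser): enforces
 * caller-supplied limits on line length and on the number of header lines.
 */
class BoundedHeaderReader {
    private final int maxLineLength;
    private final int maxHeaderCount;

    BoundedHeaderReader(int maxLineLength, int maxHeaderCount) {
        this.maxLineLength = maxLineLength;
        this.maxHeaderCount = maxHeaderCount;
    }

    /** Reads raw header lines up to the blank line that ends the header block. */
    List<String> readHeaders(InputStream in) throws IOException {
        List<String> headers = new ArrayList<>();
        while (true) {
            String line = readLine(in);
            if (line == null || line.isEmpty()) {
                return headers; // end of stream or end of header block
            }
            // Folded continuation lines are counted as separate lines here.
            if (headers.size() >= maxHeaderCount) {
                throw new IOException("more than " + maxHeaderCount + " header lines");
            }
            headers.add(line);
        }
    }

    // Reads one line, treating each byte as an ISO-8859-1 character.
    // A trailing CR counts toward the length limit in this sketch.
    private String readLine(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                int len = sb.length();
                if (len > 0 && sb.charAt(len - 1) == '\r') {
                    sb.setLength(len - 1); // strip the CR of a CRLF pair
                }
                return sb.toString();
            }
            if (sb.length() >= maxLineLength) {
                throw new IOException("header line longer than " + maxLineLength + " characters");
            }
            sb.append((char) b);
        }
        return sb.length() == 0 ? null : sb.toString();
    }
}
```

With limits like these, an oversized or endless header block surfaces as an ordinary exception in the thread doing the parsing, which the application can catch and handle like any other failed request.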