Message-ID: <1385735317.10107.5.camel@ubuntu>
Subject: Re: Excessive buffering for chunked responses when gzip is used
From: Oleg Kalnichevski
To: HttpClient User Discussion
Date: Fri, 29 Nov 2013 15:28:37 +0100

On Fri, 2013-11-29 at 17:20
+0400, Alexey Ermakov wrote:
> Hi,
>
> One of the services I'm using serves realtime data via HTTP using a Twitter-like streaming scheme (chunked transfer encoding, JSON messages separated by \r\n). While trying to consume this data using HttpClient (4.3.1 if that matters) I quickly ran into a problem. When the data is sent uncompressed (either by requesting it directly from the backend or by disabling compression) the read() on the response entity's input stream completes as soon as the server sends me another chunk. However, if gzip is enabled (which is what we use in production by compressing the data via nginx), the read() gets stuck for a rather long time, presumably until some buffer inside HC fills. Since the messages themselves are very small and similar, they tend to compress very well, which results in severe lag.
> The problem isn't encountered when the same data is consumed using AsyncHttpClient or plain old curl --compress -N, so it must be an issue with AHC. I couldn't find any relevant RequestConfig/ConnectionConfig settings that would help. Is there anything I'm missing?
>
> Scala code + nginx config that could be used to reproduce the issue are here, I can rewrite in Java if that will help.

Alexey,

HttpClient makes use of the standard Java GZIPInputStream class to decompress GZIP encoded content entities. Whatever buffering is going on in that class we, as HttpClient developers, have little to no control over. GZIPInputStream is known to have been a trouble-maker in the past [1]. Give the patch attached to the issue a try and let me know if that fixes the problem for you.

Oleg

[1] https://issues.apache.org/jira/browse/HTTPCLIENT-1403
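
For anyone who wants to look at GZIPInputStream in isolation, outside HttpClient, here is a minimal sketch. It is not HttpClient code and not the patch from [1]. It assumes the sender sync-flushes the gzip stream after every message (whether nginx does so in a given setup is an assumption), and SegmentedStream is a hypothetical stand-in for a socket that hands over at most one flushed segment per read call. The class and method names are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipFlushDemo {

    /** Hypothetical socket stand-in: a read never crosses a flushed-segment boundary. */
    static final class SegmentedStream extends InputStream {
        private final byte[] data;
        private final List<Integer> ends; // exclusive end offset of each segment
        private int pos = 0;
        private int segment = 0;

        SegmentedStream(byte[] data, List<Integer> ends) {
            this.data = data;
            this.ends = ends;
        }

        @Override
        public int read() throws IOException {
            byte[] one = new byte[1];
            return read(one, 0, 1) == -1 ? -1 : one[0] & 0xFF;
        }

        @Override
        public int read(byte[] b, int off, int len) {
            if (len == 0) return 0;
            if (pos >= data.length) return -1;
            // Move on to the segment that contains the current position.
            while (pos >= ends.get(segment)) segment++;
            int n = Math.min(len, ends.get(segment) - pos);
            System.arraycopy(data, pos, b, off, n);
            pos += n;
            return n;
        }
    }

    /** Compresses each message and sync-flushes after it, recording where each flush ended. */
    static SegmentedStream compress(List<String> messages) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        List<Integer> ends = new ArrayList<>();
        try (GZIPOutputStream gz = new GZIPOutputStream(sink, true)) { // syncFlush = true
            for (String m : messages) {
                gz.write(m.getBytes(StandardCharsets.UTF_8));
                gz.flush();
                ends.add(sink.size());
            }
        }
        ends.add(sink.size()); // final deflate block plus gzip trailer
        return new SegmentedStream(sink.toByteArray(), ends);
    }

    /** Returns what each individual read() call on the GZIPInputStream produced. */
    static List<String> readPerCall(InputStream raw) throws IOException {
        List<String> reads = new ArrayList<>();
        try (GZIPInputStream in = new GZIPInputStream(raw)) {
            byte[] buf = new byte[1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                reads.add(new String(buf, 0, n, StandardCharsets.UTF_8));
            }
        }
        return reads;
    }

    public static void main(String[] args) throws IOException {
        List<String> msgs = Arrays.asList("{\"tick\":1}\r\n", "{\"tick\":2}\r\n", "{\"tick\":3}\r\n");
        for (String chunk : readPerCall(compress(msgs))) {
            System.out.print("read returned: " + chunk);
        }
    }
}
```

The sketch only covers the case where the underlying stream returns partial reads and the compressed data is flushed per message. It says nothing about the stream wrappers HttpClient places around the connection or about how the server actually flushes, so it is a way to narrow the problem down rather than a diagnosis.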