hc-httpclient-users mailing list archives

From "Moore, Jonathan (CIM)" <Jonathan_Mo...@Comcast.com>
Subject Re: 4.2.3 gzip caching still broken?
Date Thu, 11 Apr 2013 18:59:00 GMT
Hi Adam,

This is related to the way the caching module handles cache variants. In this case I suspect
the origin is (correctly) setting Vary: Accept-Encoding.

There are two cache entries here; however only one of them should have the response body present,
IIRC. The version without the prepended request headers is treated as the "parent" entry and
the one with the headers is the actual cached variant.

This structure is in place because certain requests that pass through the cache are required
to invalidate all the variants for that URL, so we need someplace to tie those together. It
doubles the header space used but does not double the response body space. As the number of
variants grows, the relative overhead of the duplicated headers drops further.
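
To make the parent/variant layout concrete, here is a minimal sketch in plain Java. It is a simplified model, not the caching module's actual code: the class VariantStoreSketch, its methods, and the example.invalid URL are all hypothetical, and the key format is only an approximation of the {Accept-Encoding=} prefix seen in the log.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of a parent entry that ties variant entries together.
public class VariantStoreSketch {

    static final class Entry {
        final String headers;
        final byte[] body; // null for the parent entry in this sketch
        // variant key -> cache key of the stored variant (used by the parent only)
        final Map<String, String> variants = new LinkedHashMap<>();

        Entry(String headers, byte[] body) {
            this.headers = headers;
            this.body = body;
        }
    }

    // Backing store: cache key -> entry.
    final Map<String, Entry> store = new HashMap<>();

    // Stores the variant under "variantKey + url" and registers it on the parent.
    void putVariant(String url, String variantKey, String headers, byte[] body) {
        String cacheKey = variantKey + url;
        store.put(cacheKey, new Entry(headers, body));
        // The parent duplicates the headers but carries no body.
        Entry parent = store.computeIfAbsent(url, k -> new Entry(headers, null));
        parent.variants.put(variantKey, cacheKey);
    }

    // Invalidating a URL removes the parent and every variant it points to.
    void invalidate(String url) {
        Entry parent = store.remove(url);
        if (parent == null) {
            return;
        }
        for (String cacheKey : parent.variants.values()) {
            store.remove(cacheKey);
        }
    }

    public static void main(String[] args) {
        VariantStoreSketch cache = new VariantStoreSketch();
        String url = "https://example.invalid/images/1"; // placeholder URL
        cache.putVariant(url, "{Accept-Encoding=gzip}", "Vary: Accept-Encoding", new byte[1024]);
        cache.putVariant(url, "{Accept-Encoding=}", "Vary: Accept-Encoding", new byte[4096]);
        System.out.println(cache.store.size()); // 3: one parent plus two variants
        cache.invalidate(url);
        System.out.println(cache.store.size()); // 0: parent and variants all removed
    }
}
```

With a single variant the store holds two entries for one URL, which matches what shows up in the backing cache; only the variant entry carries a body in this model.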

Now, the fact that you are seeing {Accept-Encoding=} indicates that the request did not have
Accept-Encoding: gzip on it when it passed through the cache. I think this means you have the
DecompressingHttpClient and CachingHttpClient wired up backwards.

You want the CachingHttpClient as close to the final DefaultHttpClient as possible. So these
should be layered as:

DecompressingHttpClient -> CachingHttpClient -> DefaultHttpClient
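
As a rough illustration of why the order matters, the sketch below models the three layers with plain Java stand-ins. Everything in it is hypothetical (the Client interface, the Origin, Caching and Decompressing classes, the example.invalid URL, and the simplified key format); it only mimics the idea that the decompressing layer adds Accept-Encoding: gzip and the caching layer builds its variant key from the headers it sees. With the real 4.2.x classes the recommended wiring would presumably be new DecompressingHttpClient(new CachingHttpClient(new DefaultHttpClient())), but please check the constructors against the API docs for your version.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for the three client layers; not the HttpClient API.
public class LayeringSketch {

    interface Client {
        String execute(String url, Map<String, String> headers);
    }

    // Stands in for DefaultHttpClient talking to the origin.
    static final class Origin implements Client {
        public String execute(String url, Map<String, String> headers) {
            return "gzip".equals(headers.get("Accept-Encoding")) ? "gzip-body" : "plain-body";
        }
    }

    // Stands in for CachingHttpClient: derives a variant key from the request it sees.
    static final class Caching implements Client {
        private final Client backend;
        String lastVariantKey;

        Caching(Client backend) {
            this.backend = backend;
        }

        public String execute(String url, Map<String, String> headers) {
            String encoding = headers.getOrDefault("Accept-Encoding", "");
            lastVariantKey = "{Accept-Encoding=" + encoding + "}" + url;
            return backend.execute(url, headers);
        }
    }

    // Stands in for DecompressingHttpClient: adds the header, then "decompresses".
    static final class Decompressing implements Client {
        private final Client backend;

        Decompressing(Client backend) {
            this.backend = backend;
        }

        public String execute(String url, Map<String, String> headers) {
            Map<String, String> copy = new HashMap<>(headers);
            copy.put("Accept-Encoding", "gzip");
            String body = backend.execute(url, copy);
            return "gzip-body".equals(body) ? "plain-body" : body;
        }
    }

    // Runs one request through the chosen wiring and returns the key the cache built.
    static String keySeenByCache(boolean decompressingOutermost, String url) {
        Caching cache;
        Client top;
        if (decompressingOutermost) {
            cache = new Caching(new Origin());   // Decompressing -> Caching -> Origin
            top = new Decompressing(cache);
        } else {
            cache = new Caching(new Decompressing(new Origin())); // Caching -> Decompressing -> Origin
            top = cache;
        }
        top.execute(url, new HashMap<>());
        return cache.lastVariantKey;
    }

    public static void main(String[] args) {
        String url = "https://example.invalid/a"; // placeholder URL
        System.out.println(keySeenByCache(true, url));  // {Accept-Encoding=gzip}https://example.invalid/a
        System.out.println(keySeenByCache(false, url)); // {Accept-Encoding=}https://example.invalid/a
    }
}
```

The second line of output has the same empty {Accept-Encoding=} prefix as the log below, which is what you would expect if the cache sits outside the decompressing layer.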

One of the updates in the 4.3 release will take more care of this wiring for you out of the box.

Jon Moore

On Apr 11, 2013, at 2:20 PM, "Adam Patacchiola" <adam@2fours.com> wrote:

> I have done some more research on this and it appears that the caching is
> working, however it is adding 2 entries to the backing cache: one each with
> and without the url pre-pended by {Accept-Encoding=}. This results in a
> cache miss for the get with the pre-pended url, and uses double the storage
> space in whatever mechanism you are backing the client with. There was a
> bug in my backing store which led to me initially believing it was not
> caching the (correct) url.
> tl;dr it is caching but adding a duplicate invalid entry that never gets
> hit.
> On Thu, Apr 11, 2013 at 10:13 AM, Adam Patacchiola <adam@2fours.com> wrote:
>> I'm using 4.2.3 with gzip compression and CachingHttpClient. Initially I
>> implemented the custom request/response interceptors as described here:
>> https://hc.apache.org/httpcomponents-client-ga/httpclient/examples/org/apache/http/examples/client/ClientGZipContentCompression.java
>> This did not work, resulting in the issue described here:
>> https://issues.apache.org/jira/browse/HTTPCLIENT-1163.
>> It appeared to me from reading this issue that using the
>> "CompressionDecorator" would resolve the issue so I modified my code to use
>> DecompressingHttpClient (
>> https://hc.apache.org/httpcomponents-client-ga/httpclient/apidocs/org/apache/http/impl/client/DecompressingHttpClient.html)
>> but the issue still persists as we can see from the below log output. It is
>> caching using one (broken?) key but then looking it up using a different
>> (correct?) key which is consistent with the bug above:
>> 04-11 09:32:54.760: ... putting cache entry, url: {Accept-Encoding=}
>> https://www.surespot.me:8080/images/b:f1/165
>> 04-11 09:32:55.965: ... Cache miss [host: https://www.surespot.me:8080;
>> uri: https://www.surespot.me:8080/images/b:f1/165]
>> Am I missing something or is this still broken?
>> Thanks,
>> Adam

To unsubscribe, e-mail: httpclient-users-unsubscribe@hc.apache.org
For additional commands, e-mail: httpclient-users-help@hc.apache.org
