From "Moore, Jonathan" <Jonathan_Mo...@Comcast.com>
Subject RE: roadmap for caching module
Date Wed, 29 Sep 2010 14:27:29 GMT
Hi Oleg,

We actually have a dev team currently working on the cache-based 304s, and they will be tackling
the 'variant miss', only-if-cached, and heuristic caching in short order. I think it is possible
we might see those done within the next two weeks or so.

At that point, I would feel pretty comfortable going for option (1) of releasing to GA with
caveats, as you mention. We'll benefit more from community input and usage than we will from
implementing the nice-to-haves of request collapsing, RFC5861, or byterange handling, and
I would be comfortable having those wait until 4.2.

We could also take the approach of drastically limiting the public API to the point of just
allowing people to instantiate the existing client. I've been thinking about separating the
current CachingHttpClient class out into a façade class and an implementation class. So CachingHttpClientImpl
would be package-private and would have all the guts but just a single constructor that takes
all dependencies; all of the current instantiation work that goes on in some of the constructors
can then happen in the CachingHttpClientFacade (class names for illustration purposes only...).

Then we have the CachingHttpClientFacade just have the following constructors (plus variants
for each that also take a CacheConfig and/or the backing HttpClient):

public CachingHttpClientFacade(); // default in-memory implementation
public CachingHttpClientFacade(Ehcache storage);
public CachingHttpClientFacade(MemcachedClientIF storage);

So the only things a client can instantiate are a CachingHttpClientFacade and a CacheConfig.
This has the temporary drawback that folks won't be able to extend the implementation without
patching, but it vastly reduces the API surface and hence our risk of breaking binary compatibility
as we add the other features. Maybe it also encourages people to submit patches. :)
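
To make the split concrete, here is a rough, self-contained sketch of the idea. It is illustrative only and not the module's actual code: CacheStorage, InMemoryStorage, the execute(String) method and the Function-based backend are hypothetical stand-ins for the real storage backends and HttpClient, and the classes are left non-public only so the sketch fits in one file (the facade and CacheConfig would be public in practice).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Would be public in a real source tree; carries tuning knobs only.
final class CacheConfig {
    final int maxEntries; // hypothetical setting, not enforced in this sketch

    CacheConfig() { this(1000); }
    CacheConfig(int maxEntries) { this.maxEntries = maxEntries; }
}

// Hypothetical storage abstraction standing in for Ehcache/memcached backends.
interface CacheStorage {
    String get(String key);
    void put(String key, String value);
}

// Default in-memory storage.
final class InMemoryStorage implements CacheStorage {
    private final Map<String, String> entries = new ConcurrentHashMap<>();

    public String get(String key) { return entries.get(key); }
    public void put(String key, String value) { entries.put(key, value); }
}

// Package-private: holds all the guts, with a single constructor that
// takes every dependency and does no wiring of its own.
final class CachingHttpClientImpl {
    private final Function<String, String> backend; // stand-in for the backing HttpClient
    private final CacheStorage storage;
    private final CacheConfig config;

    CachingHttpClientImpl(Function<String, String> backend,
                          CacheStorage storage,
                          CacheConfig config) {
        this.backend = backend;
        this.storage = storage;
        this.config = config;
    }

    String execute(String uri) {
        String cached = storage.get(uri);
        if (cached != null) {
            return cached; // cache hit: the backend is not contacted
        }
        String fresh = backend.apply(uri);
        storage.put(uri, fresh);
        return fresh;
    }
}

// The only client class callers instantiate; all instantiation work lives here.
final class CachingHttpClientFacade {
    private final CachingHttpClientImpl impl;

    CachingHttpClientFacade() { this(new CacheConfig()); }

    CachingHttpClientFacade(CacheConfig config) {
        this(uri -> "response for " + uri, config); // placeholder default backend
    }

    CachingHttpClientFacade(Function<String, String> backend, CacheConfig config) {
        this.impl = new CachingHttpClientImpl(backend, new InMemoryStorage(), config);
    }

    String execute(String uri) { return impl.execute(uri); }
}
```

Since callers can only reach the facade's constructors and execute, the implementation class can be reshaped freely between releases without affecting binary compatibility.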

I recently read an article about a project that only exposes public API when someone
asks for it; we could take the same approach here and lock the API down while we're
under development, rather than trying to anticipate now how people will want to extend it.


-----Original Message-----
From: Oleg Kalnichevski [mailto:olegk@apache.org] 
Sent: Wednesday, September 29, 2010 4:39 AM
To: HttpComponents Project
Subject: Re: roadmap for caching module

On Mon, 2010-09-27 at 14:12 -0400, Moore, Jonathan wrote:
> Hi folks,
> After an email exchange with Oleg, I wanted to identify a bit of a
> roadmap for the caching module in HttpClient 4.1.X. Primarily this is an
> exercise to identify all the things we'd like to do as a way to think
> about whether the current class design can easily accommodate them or
> not. As we are trying to get the 4.1 release through to GA, it will be
> good to understand how the public API of the caching module might or
> might not change.
> At this point, we've gotten the caching module to the point where I
> believe we can call it 'HTTP/1.1 conditionally compliant' (implements
> all the MUSTs and MUST NOTs from RFC 2616). As such, I think it is
> already in a pretty usable and useful state where folks can begin
> getting value from it. We have unit tests in the
> TestProtocolRequirements that should help guarantee we won't break that
> going forward.
> As for upcoming work we'd like to do, I think these can be grouped into
> a few different sections.
> (These are SHOULD/SHOULD NOT statements from RFC 2616, and will move us
> toward unconditional compliance and hence greater interoperability.
> Eventually we should consider looking through the entire RFC for these
> recommendations for a transparent proxy cache, but for the time being
> I'll focus on those in the caching section (Section 13) and in the
> description for Cache-Control and conditional headers in Section 14).
> * send etags of all variants when servicing a 'variant miss' (13.6)
> * support If-None-Match and If-Modified-Since for cache hits (14.25,
> 14.26)
> * support only-if-cached Cache-Control directive (14.9.4)
> (These are specified as MAY statements in the RFC).
> * support heuristic caching (13.2.2, 13.2.4) including a default
> freshness lifetime; this is in JIRA as HTTPCLIENT-990
> * support byterange requests and partial responses (3.12, 10.2.7,
> 13.5.4, 14.16, 14.27, 14.35)
> * support RFC 5861 (stale-while-revalidate and stale-if-error); this is
> in JIRA already as HTTPCLIENT-975
> * request collapsing: if a request comes in for a resource that is
> currently being revalidated, wait for the result of the other
> revalidation rather than sending another revalidation downstream
> We're actually planning to get to some of the "PROTOCOL RECOMMENDATIONS"
> section shortly. RFC 5861 and request collapsing will be interesting as
> they will introduce some asynchrony of operation and hence more
> synchronization trickery, although I think everything else can be
> relatively straightforwardly accommodated in the current design.
> Do folks have opinions as to relative priority? I think I've laid them
> out roughly "most desirable" to "least desirable", with the exception of
> the byterange/partial response support, which I would actually put last
> in priority order (I don't think this is commonly used and adds quite a
> bit of complexity, so it doesn't have a lot of bang for the buck).
> Jon

Jon et al

The roadmap looks pretty ambitious and is likely to take many months of
work to get there. This, however, ideally should not block 4.1 GA. So,
we have got to make a decision on the release strategy as well. 

I personally see two options right now

(1) release 4.1-beta1 shortly; release 4.1 GA Q4 2010; put a big fat
disclaimer on the caching module clearly stating its experimental
status, and, though the best effort will be made to keep it binary / API
compatible, incompatible changes may be made in subsequent 4.x releases

(2) release 4.1-alpha3 shortly; spend more time polishing the API;
release 4.1-beta1 Q1 2011; release 4.1 GA Q2 2011

Effectively, it all boils down to how comfortable we are with the API
and whether a few more months can make a difference or not.



