hc-dev mailing list archives

From Oleg Kalnichevski <ol...@apache.org>
Subject Re: Caching and async http client
Date Thu, 05 Jun 2014 10:52:39 GMT
On Wed, 2014-06-04 at 15:38 -0400, Craig Skinfill wrote:
> HTTP Components team - I have some approved time this summer to work on an
> open source project, and I'd like to work on improving the caching support
> in the async http client.  Currently, the requests to the origin are
> non-blocking, but the requests to the cache are blocking.  The async
> caching support appears to be implemented as a decorator of the http
> client, while in the blocking client case it's implemented by decorating the
> internal ClientExecChain instance.
> My initial idea was to follow the same pattern in the async client as with
> the blocking client, and use an internal ExecutorService to submit requests
> to the cache, and then block (with a timeout) the returned Future with the
> cache lookup result.  This is of course still blocking, but at least
> provides a potentially configurable timeout when checking the cache.
> How should I approach this?  I see a comment in
> https://issues.apache.org/jira/browse/HTTPASYNC-76 regarding the likely
> need to make changes to the existing blocking http client caching
> implementation along with changes to the core async http client protocol
> pipeline processing.  Are there any existing ideas, plans, etc., for making
> the caching non-blocking for the async client?  Or what changes would be
> needed in the blocking client's caching implementation?
> Is there enough need to make this improvement?
> Thanks.
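The approach proposed above (submit the blocking cache lookup to an internal ExecutorService and wait on the returned Future with a timeout) could look roughly like the sketch below. The class and the in-memory map are hypothetical stand-ins for illustration only, not actual HttpClient caching API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Sketch only: the cache lookup itself still blocks, but it blocks a
// worker thread rather than the I/O reactor thread, and only up to a
// configurable timeout.
public class TimedCacheLookup {

    private final ExecutorService executor = Executors.newCachedThreadPool();

    // Hypothetical in-memory cache; the real HttpCacheStorage API differs.
    final Map<String, String> cache = new ConcurrentHashMap<>();

    // Submits the blocking lookup and waits at most timeoutMillis for the
    // result; a timeout or failure is treated as a cache miss (null), so
    // the caller can fall through to the origin server.
    String lookupWithTimeout(String uri, long timeoutMillis) {
        Future<String> future = executor.submit(() -> cache.get(uri));
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException ex) {
            future.cancel(true); // give up on the cache for this request
            return null;
        } catch (InterruptedException | ExecutionException ex) {
            return null;
        }
    }

    void shutdown() {
        executor.shutdown();
    }
}
```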

Hi Craig

Async HTTP caching is a much neglected area in HC. Any contribution
there would be enormously welcome. I, for one, am very happy to have you
on board.

Async HTTP caching is a difficult task from a purely design perspective
and is likely to require several iterations to get things right. In
general, non-blocking I/O makes certain things easier but it also makes other
things much more complex. Content (data) streaming is one of those
things. The standard Java InputStream / OutputStream API is simple and
effective, but it is inherently blocking and simply does not work well
with event-driven designs. For non-blocking transports we use a
consumer / producer model that enables a reactive programming style and
works well for data-intensive applications. The problem is that it is
damn hard to organize those consumers and producers into a pipeline
based on the chain of responsibility pattern. The ability to model protocol
processing logic as a sequence of related and interdependent elements is
what makes integration of caching aspects into the blocking client
seamless and efficient. Ideally, we should be able to do the same for
the non-blocking client. Another major issue is that presently HTTP
cache components are tightly coupled with InputStream and the whole
design of the caching APIs is effectively blocking. 
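The contrast between the two styles can be illustrated with a minimal sketch. The `ContentConsumer` interface below is an illustrative stand-in, not the actual HttpCore consumer/producer API:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

public class StreamingStyles {

    // Blocking style: the caller pulls bytes and the thread blocks in
    // read() until data arrives or the stream ends.
    static int countBytesBlocking(InputStream in) throws IOException {
        byte[] buf = new byte[256];
        int total = 0;
        int n;
        while ((n = in.read(buf)) != -1) { // blocks here
            total += n;
        }
        return total;
    }

    // Event-driven style: the transport pushes chunks into a consumer
    // whenever data becomes available; no thread ever blocks waiting
    // for input.
    interface ContentConsumer {
        void consume(ByteBuffer chunk);
        void completed();
    }

    static class CountingConsumer implements ContentConsumer {
        int total;
        boolean done;
        @Override public void consume(ByteBuffer chunk) {
            total += chunk.remaining();
        }
        @Override public void completed() {
            done = true;
        }
    }
}
```

The pull-based loop composes naturally into a chain of decorators, which is what the blocking cache exploits; the push-based consumer does not, which is the design difficulty described above.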

I must confess that I do not see an easy solution to those design
issues. No matter what we do we are likely to end up breaking existing
APIs, which is also a problem. So, I can also well imagine that we make
the decision to _not_ support data streaming with caching at all (at
least initially). If we always buffer messages in memory it would make
it much easier to come up with a reasonable processing pipeline design,
which is asynchronous but only at the HTTP message level. This would
also enable us to fully re-use blocking caching elements without having
to alter them. It might be an unpleasant but necessary compromise.
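The buffer-everything compromise could be sketched roughly as follows: accumulate chunks from the non-blocking transport in memory, then expose the complete body as an InputStream so the existing blocking cache code can consume it unchanged. All names here are illustrative, not actual HttpClient API:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.nio.ByteBuffer;

// Sketch only: asynchronous at the HTTP message level, blocking (but
// in-memory, so effectively instantaneous) at the body level.
public class BufferingBodyConsumer {

    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

    // Invoked by the non-blocking transport as each chunk arrives.
    void onChunk(ByteBuffer chunk) {
        byte[] bytes = new byte[chunk.remaining()];
        chunk.get(bytes);
        buffer.write(bytes, 0, bytes.length);
    }

    // Invoked once the message is complete: the fully buffered body can
    // now be handed to the existing InputStream-based caching components.
    InputStream asInputStream() {
        return new ByteArrayInputStream(buffer.toByteArray());
    }
}
```

The obvious cost is that responses too large to buffer cannot be cached this way, which is why the compromise would likely be an initial step only.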

If this all does not sound too depressing, this issue might be a good
starting point. It would also give you good exposure to the existing
code base and API design.



To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org
