hc-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Oleg Kalnichevski <ol...@apache.org>
Subject Re: thoughts on httpclient decorators
Date Sun, 01 Jul 2012 14:02:01 GMT
On Wed, 2012-06-06 at 17:49 -0400, Jon Moore wrote:
> Great! I'm glad it seems to be working out. I'm all in favor of taking a
> bunch of smaller refactoring steps to get there, to lessen the chances we
> break things in there.
> 
> I've been thinking about relative ordering of the layers.
> 
> I think the caching layer has to be pretty close to the bottom, perhaps the
> first decorator around BasicHttpClient, primarily because variant-related
> headers(e.g. decompression), authentication, and cookie handling all have
> an impact on the correct operation of the cache. I think redirect handling
> has an impact too, because it is not transparent and may hide the actual
> origin of a response on a cache hit (where it wouldn't get populated
> correctly into the HttpContext).
> 
> I haven't gotten any further than that at this point--I'm not familiar with
> the relative interactions of authentication and redirects, for example.
> Looking forward to seeing where you get with the code.
> 
> Jon
> 

Hi Jon

I completed the initial refactoring of the request execution logic into
a number of much smaller classes organized as a chain of command objects
(similarly to the chain of responsibility pattern) each taking care of
just one specific aspect of the request execution process [1].

Presently the execution chain is now organized as follows

backoff strategy exec -> service unavailable (503) retry exec ->
redirect exec -> i/o retry exec -> protocol exec -> main exec

Any of those executors can be removed if not required thus making the
execution chain shorter and more efficient. HttpClient instances can be
built with the new HttpClientBuilder class and are now immutable.

The caching aspect can probably be inserted just before the last
executor giving it access to a fully constructed request object with all
the headers (thus fixing HTTPCLIENT-1190) and to the original,
unmodified response object (thus fixing issues with transparent entity
decompression).

Overall, I think it is a much better design. Thank you very much for
inspiring me to take a stab at it. MainClientExec class is still a
monstrosity but it is now 600 line long instead of 2000+, which will
undoubtedly make it more maintainable.

Please review the changes in the decorator-refactoring and let me know
what you think of it. 

Oleg

[1]
https://svn.apache.org/repos/asf/httpcomponents/httpclient/branches/decorator-refactoring/


> On Wed, Jun 6, 2012 at 4:25 PM, Oleg Kalnichevski <olegk@apache.org> wrote:
> 
> > On Sat, 2012-06-02 at 15:15 +0200, Oleg Kalnichevski wrote:
> > > On Fri, 2012-06-01 at 10:07 -0400, Jon Moore wrote:
> > > > As we've seen, we've run into a couple of problems with some of our
> > > > decorators for the DefaultHttpClient where underlying backends do
> > > > non-transparent things (ranging from decompressing content bodies
> > without
> > > > updating response headers, to following redirects). I do think the
> > > > decorators provide a nice way to layer the overall functionality from
a
> > > > codebase point of view, but I think they are brittle because they rely
> > on
> > > > unenforceable assumptions (e.g. that backend clients that get injected
> > > > behave transparently).
> > > >
> > > > I think the solution to this is to continue using the decorators, but
> > hide
> > > > them from most users--in other words, reimplementing the
> > DefaultHttpClient
> > > > as a stack of decorators that get enabled/disabled by configuration. In
> > > > other words, a user should be able to "new DefaultHttpClient()" and get
> > > > something that is capable of caching, following redirects, and handling
> > > > compression (per configuration).
> > > >
> > > > I think this looks like:
> > > > (1) generate a decorator for redirect following
> > > > (2) create a new class (BasicHttpClient?) that is essentially
> > > > DefaultHttpClient minus the redirect logic
> > > > (3) reimplement DefaultHttpClient so that it generates and configures
a
> > > > stack of RedirectFollowingHttpClient, DecompressingHttpClient,
> > > > CachingHttpClient, BasicHttpClient (this might mean needing to pull the
> > > > httpclient-cache module into the httpclient module)
> > > > (4) add warning documentation on the DecompressingHttpClient and
> > > > CachingHttpClient about needing to wire them up correctly (this should
> > > > probably happen first, actually)
> > > >
> > > > I suspect there are actually other decorators lurking in there too
> > > > (authentication? cookie handling?).
> > > >
> > > > I think this potentially gives us a clean way to layer in
> > > > functionality--implement it as a decorator and then figure out where
> > in the
> > > > stack is the right/safe way to add it. Opting into the new
> > functionality is
> > > > a matter of turning it on in the config for DefaultHttpClient, which
> > is far
> > > > simpler for users to do.
> > > >
> > > > Thoughts?
> > > >
> > > > Jon
> > >
> > > Hi Jon
> > >
> > > Absolutely, having a number of smaller specialized components combined
> > > together into a processing chain (based on the chain of responsibility
> > > pattern0 would make for a much better design and clear API. There is,
> > > unfortunately, a reason why redirect and authentication handling logic
> > > is currently crammed into one class (DefaultRequestDirector). There can
> > > be really complex scenarios where a redirect can cause re-authentication
> > > with the site and successful authentication can trigger a new redirect,
> > > which in its turn causes another re-authentication. It took _literally_
> > > 10 years to get all those scenarios working properly, especially those
> > > involving NTLM. The last known issue related to HTTP authentication was
> > > resolved just recently in the 4.2 release. Speaking from experience
> > > troubleshooting such complex HTTP exchanges I suspect redirect and
> > > authentication handling logic are simply inseparable and must be
> > > executed in a single request execution loop. To make matters worse, many
> > > of those scenarios cannot be easily simulated (mainly thanks to NTLM)
> > > and therefore are not covered in our unit and integration tests.
> > >
> > > Having said all that please do not feel discouraged from pursuing the
> > > idea of splitting DefaultRequestDirector into smaller chunks of code and
> > > moving redirect handling code into a separate HttpClient decorator. We
> > > probably can still settle for a middle ground approach: leave redirect /
> > > auth logic in DefaultRequestDirector but disable it when using
> > > HttpClient decorators. In any way I really like the idea of developing a
> > > builder class to instantiate HttpClient for complex scenarios involving
> > > caching / transparent decompression / authentication and transparent
> > > redirect handing. Feel free to create a branch for 4.2.x releases and
> > > start working on the trunk or on an experimental branch.
> > >
> > > cheers
> > >
> > > Oleg
> > >
> >
> > Hi Jon
> >
> > I started experimenting with your idea on a branch [1] and now think it
> > is doable. It is a much nicer design which should result in much cleaner
> > code. However, a successful implementation of the new design will
> > effectively amount to a complete rewrite of DefaultRequestDirector,
> > which is probably the single most complex piece of code in HttpClient.
> > I'll try to get something you could look at toward the end of the week
> >
> > Cheers
> >
> > Oleg
> >
> > [1]
> >
> > https://svn.apache.org/repos/asf/httpcomponents/httpclient/branches/decorator-refactoring/
> >
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
> > > For additional commands, e-mail: dev-help@hc.apache.org
> > >
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
> > For additional commands, e-mail: dev-help@hc.apache.org
> >
> >



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


Mime
View raw message