hc-dev mailing list archives

From Oleg Kalnichevski <ol...@apache.org>
Subject [HttpCore] NIO extensions: event-driven non-blocking HTTP transport
Date Thu, 12 Oct 2006 20:58:39 GMT
On Mon, 2006-10-09 at 15:38 -0400, Sam Berlin wrote:
> I can't argue with that. :)
> 
> Perhaps, while you're developing the API, keep the door open in your
> mind about this blocking model being just one kind of interface to the
> data?  That is, considering reading the data via a pluggable
> 'interpreter' of some sort that's installed over the socket and parses
> the data to set headers (and other things).  That way, in the future
> when more time is available for everyone, it won't require breaking
> backwards compatibility in order to retrofit non-blocking I/O.
> 

Folks,

After having spent some time evaluating the two options, I finally
concluded we should take the time and develop an event-driven API for the
non-blocking HTTP transport as a first step.

All existing NIO HTTP transport implementations that rely on the
InputStream / OutputStream abstraction I have seen so far suffer from
the same fatal flaw. They tend to do an excessive amount of intermediate
buffer copying and are very prone to 'out of memory' conditions under
heavy load. My goal is to make sure that the NIO-based transport is not
only non-blocking but is also memory efficient. In those cases where no
content encoding/decoding is involved I want to make sure that the HTTP
service can _directly_ write to / read from the underlying socket
channel without any intermediate buffering.  
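
For illustration, the direct-to-channel idea can be sketched roughly as
follows. This is only a minimal sketch, not HttpCore code: the class
DirectTransferSketch and its methods onWritable and roundTrip are
hypothetical names, and it assumes the content comes from a file so that
FileChannel.transferTo can hand the bytes to the target channel without the
caller allocating a buffer of its own.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical content producer driven by "channel is writable" events.
final class DirectTransferSketch {

    private final FileChannel source;
    private long position; // number of bytes already handed to the target

    DirectTransferSketch(FileChannel source) {
        this.source = source;
    }

    // Intended to be called once per write-ready event. Makes a single
    // transferTo call, which may move fewer bytes than requested (possibly
    // zero), and reports whether all content has been sent.
    boolean onWritable(WritableByteChannel target) throws IOException {
        long remaining = source.size() - position;
        if (remaining > 0) {
            position += source.transferTo(position, remaining, target);
        }
        return position >= source.size();
    }

    // Demo helper (assumption: an in-memory channel stands in for a socket).
    static byte[] roundTrip(byte[] content) {
        try {
            Path file = Files.createTempFile("entity", ".bin");
            try {
                Files.write(file, content);
                ByteArrayOutputStream sink = new ByteArrayOutputStream();
                try (FileChannel source = FileChannel.open(file, StandardOpenOption.READ);
                     WritableByteChannel target = Channels.newChannel(sink)) {
                    DirectTransferSketch producer = new DirectTransferSketch(source);
                    while (!producer.onWritable(target)) {
                        // A real event loop would return to the selector here
                        // and wait for the next write-ready event.
                    }
                }
                return sink.toByteArray();
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        byte[] sent = roundTrip("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(sent, StandardCharsets.UTF_8));
    }
}
```

With a real socket channel as the target, transferTo may let the operating
system move the bytes without copying them through user space; the in-memory
target in roundTrip is there only to keep the sketch self-contained.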

This decision also entails some refactoring in HttpCore proper, as I
need certain classes decoupled from InputStream / OutputStream in order
to be usable in NIO extensions. I will have to factor out the content
length strategy code from the DefaultEntityDeserializer and
DefaultEntitySerializer classes.
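
A content length strategy decoupled from the stream classes could look
roughly like the sketch below. None of this is existing HttpCore API:
ContentLengthSketch, determineLength, and the IDENTITY / CHUNKED markers are
hypothetical, and headers are modelled as a plain Map for brevity. The rules
are a simplified version of the usual HTTP/1.1 precedence: Transfer-Encoding
wins over Content-Length, and a message with neither is treated as delimited
by the end of the connection.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, stream-independent content length strategy.
final class ContentLengthSketch {

    static final long IDENTITY = -1; // body ends when the connection closes
    static final long CHUNKED = -2;  // body uses chunked transfer coding

    // Returns a non-negative length, or one of the markers above.
    static long determineLength(Map<String, String> headers) {
        String transferEncoding = find(headers, "Transfer-Encoding");
        if (transferEncoding != null) {
            // Only the last listed coding decides how the body is framed.
            int comma = transferEncoding.lastIndexOf(',');
            String last = transferEncoding.substring(comma + 1).trim();
            return last.equalsIgnoreCase("chunked") ? CHUNKED : IDENTITY;
        }
        String contentLength = find(headers, "Content-Length");
        if (contentLength != null) {
            long length;
            try {
                length = Long.parseLong(contentLength.trim());
            } catch (NumberFormatException ex) {
                throw new IllegalArgumentException(
                        "Invalid Content-Length: " + contentLength, ex);
            }
            if (length < 0) {
                throw new IllegalArgumentException(
                        "Negative Content-Length: " + contentLength);
            }
            return length;
        }
        return IDENTITY;
    }

    // Header names are case-insensitive, so compare them that way.
    private static String find(Map<String, String> headers, String name) {
        for (Map.Entry<String, String> entry : headers.entrySet()) {
            if (entry.getKey().equalsIgnoreCase(name)) {
                return entry.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        headers.put("Content-Length", "128");
        System.out.println(determineLength(headers));
    }
}
```

Because the method only inspects headers and returns a number, both a
stream-based and a channel-based transport could share it and pick their own
encoder or decoder from the result.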

Any major objections to that?

Cheers

Evil Comrade Oleg

> I also apologize for only providing my thoughts and not actual code
> (as I've said I would a few times over the past few years), and am
> very grateful to you and Roland and the others who do provide code.
> As I'm sure you all know, it's difficult to find the time around a day
> job that is constantly shifting directions.
> 
> Sam
> 
> On 10/9/06, Oleg Kalnichevski <olegk@apache.org> wrote:
> > On Mon, 2006-10-09 at 13:26 -0400, Sam Berlin wrote:
> > > I agree with Robert that it is much easier to go from an event driven
> > > model to a blocking model.  If the first layer that HttpNIO exposes is
> > > blocking, there'd need to be additional hacking below that in order to
> > > remove the blocking / thread-based layer.  On the other hand, if the
> > > first layer it exposes is non-blocking, it's relatively trivial to add
> > > a thread on top of that and expose an additional blocking layer.
> > >
> > > It is difficult to think of many scenarios that require (or are better
> > > with) non-blocking I/O, but I would caution against excluding them
> > > from HttpClient's scope.  If HttpClient 4.0 had been ready a year or
> > > so ago (with an exposed non-blocking layer), we would definitely have
> > > used it in LimeWire as the basis of file-transfers.  As-is, we
> > > invented our own minimalistic non-blocking state-based http transport
> > > for downloads.
> > >
> > > If the non-blocking layer is there, I guarantee that folks will be
> > > able to find a use for it.  Whereas if only a blocking layer is there,
> > > those developers looking for the high-performance asynchronous model
> > > will have to go elsewhere.
> > >
> >
> > Sam, Robert, et al
> >
> > Simply for practical reasons, while there are only two guys hacking (me
> > and Roland), we ought not to spread our efforts too thin. A full-blown
> > event-driven API will take time to get right. I think it is more
> > important to get an HttpCore ALPHA3 release that covers 95% of use cases
> > out the door sooner rather than later. Beyond that it is just a matter
> > of priorities and available time.
> >
> > Oleg
> >
> >
> > > Sam
> > >
> > > On 10/9/06, Robert Olofsson <robo@khelekore.org> wrote:
> > > >
> > > > Since my proxy is almost fully nio/event based I would like to share
> > > > a few comments.
> > > >
> > > > Oleg Kalnichevski wrote:
> > > > > I am still quite skeptical about the usefulness of a fully
> > > > > event-driven HTTP transport for one simple reason: asynchronous
> > > > > (non-blocking) I/O transport makes no sense whatsoever if the
> > > > > process of content generation or content consumption is
> > > > > synchronous (blocking).
> > > >
> > > > There are many things that may block; here are a few examples:
> > > > *) DNS lookups
> > > > *) File reading / writing
> > > > *) Database access
> > > > *) All higher-level APIs that only give you a stream
> > > > *) Calls to Runtime.exec
> > > >
> > > > That DNS lookups are also totally single threaded in native code in
> > > > some systems does not make things better.
> > > >
> > > > > If one
> > > > > needs a worker thread to generate / process content anyways, what
> > > > > is the point of having an event-driven transport?
> > > >
> > > > Agreed.
> > > > One objection here may be that you do not need one worker thread for all
> > > > of the content generation, but that usually does not make things better.
> > > > You will still need the worker thread for the _slow_and_blocking_
> > > > operation.
> > > > So if content generation/modification uses any of the above, then using
> > > > workers simplifies things a lot.
> > > >
> > > > > I see only a few scenarios
> > > > > where the third choice (event callbacks) may prove advantageous,
> > > > > primarily in HTTP proxies and gateways.
> > > >
> > > > Except that HTTP proxies do lots of DNS lookups, so they will block
> > > > a lot. My proxy spawns worker threads only when it needs to, but it
> > > > complicates some parts of the code.
> > > >
> > > > That my proxy also modifies the content and caches the data will mean
> > > > lots of other blocking calls in some of the code paths.
> > > >
> > > > > I think ultimately we need both options. I suggest we start with
> > > > > the second option, release ALPHA3 and then consider implementing
> > > > > the third option before ALPHA4 / BETA1.
> > > >
> > > > One thing to keep in mind:
> > > > It is easier to go from event driven to a blocking model than to do the
> > > > reverse. This may be an argument to go for number 3 (full nio).
> > > > If you go for 3 then make it easy to use a few selector-threads
> > > > otherwise the system will use only 1 cpu (or 1 core).
> > > >
> > > > /robo
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: httpclient-dev-unsubscribe@jakarta.apache.org
> > > > For additional commands, e-mail: httpclient-dev-help@jakarta.apache.org