hc-dev mailing list archives

From "Sam Berlin" <sber...@gmail.com>
Subject Re: [HttpCore] NIO extensions: event-driven non-blocking HTTP transport
Date Mon, 16 Oct 2006 17:35:44 GMT
No objection from me.  :)

I agree with your conclusion.  Using NIO classes (that is, channels /
buffers / selectors [not necessarily CharsetEncoders or CharsetDecoders])
for anything other than non-blocking processing is largely a waste, as it
just adds complexity with few, if any, gains.

If you want, you may be able to take some ideas from the non-blocking
HTTP code we added to LimeWire.  We developed a fairly robust state
machine that can have arbitrary read/write states added to it.  An
example use for this is adding a state that reads (or writes) headers,
then adding other states that can respond to those headers, then
adding a state that reads or writes the body of the message.  The
specifics of the code may not be relevant (since you're using a
different underlying NIO layer), but the general idea is the same.

The code is at:
https://www.limewire.org/fisheye/browse/~raw,r=1.6/limecvs/core/com/limegroup/gnutella/io/IOStateMachine.java

The generic IOState interface is:
https://www.limewire.org/fisheye/browse/~raw,r=1.2/limecvs/core/com/limegroup/gnutella/io/IOState.java


And the read/write headers implementation is:
https://www.limewire.org/fisheye/browse/~raw,r=1.2/limecvs/core/com/limegroup/gnutella/http/ReadHeadersIOState.java
and
https://www.limewire.org/fisheye/browse/~raw,r=1.3/limecvs/core/com/limegroup/gnutella/http/WriteHeadersIOState.java
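
The state-machine idea described above could be sketched roughly as follows. This is a minimal illustration only: `IoStep`, `StepMachine` and `ReadHeadersStep` are hypothetical names and do not reproduce the LimeWire classes linked above. The header step reads one byte at a time so that it never consumes body bytes, which is simple but not efficient.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical names throughout; this is not the LimeWire API.
interface IoStep {
    /**
     * Does as much work as possible without blocking.
     * Returns true if the step needs more events, false once it is complete.
     */
    boolean process(ReadableByteChannel in, WritableByteChannel out) throws IOException;
}

final class StepMachine {
    private final Queue<IoStep> steps = new ArrayDeque<>();

    void addStep(IoStep step) { steps.add(step); }

    /** Called on a readiness event; returns true when all steps are done. */
    boolean handleEvent(ReadableByteChannel in, WritableByteChannel out) throws IOException {
        while (!steps.isEmpty()) {
            if (steps.peek().process(in, out)) {
                return false;       // current step still needs more data or buffer space
            }
            steps.remove();         // step finished, continue with the next one
        }
        return true;
    }
}

/** Example step: reads bytes until a blank line (CRLF CRLF) ends the header block. */
final class ReadHeadersStep implements IoStep {
    private final StringBuilder headers = new StringBuilder();
    private final ByteBuffer buf = ByteBuffer.allocate(1);

    public boolean process(ReadableByteChannel in, WritableByteChannel out) throws IOException {
        while (true) {
            buf.clear();
            int n = in.read(buf);
            if (n < 0) throw new IOException("stream closed before end of headers");
            if (n == 0) return true;    // nothing available right now, wait for the next event
            headers.append((char) buf.get(0));
            int len = headers.length();
            if (len >= 4 && headers.substring(len - 4).equals("\r\n\r\n")) return false;
        }
    }

    String headers() { return headers.toString(); }
}
```

A body-reading or body-writing step would be added to the same machine after the header step, and it would only start running once the header step reports completion.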

Sam

On 10/12/06, Oleg Kalnichevski <olegk@apache.org> wrote:
> On Mon, 2006-10-09 at 15:38 -0400, Sam Berlin wrote:
> > I can't argue with that. :)
> >
> > Perhaps, while you're developing the API, keep the door open in your
> > mind about this blocking model being just one kind of interface to the
> > data?  That is, consider reading the data via a pluggable
> > 'interpreter' of some sort that's installed over the socket and parses
> > the data to set headers (and other things).  That way, in the future
> > when more time is available for everyone, it won't require breaking
> > backwards compatibility in order to retrofit non-blocking I/O.
> >
>
> Folks,
>
> After having spent some time evaluating the two options, I finally
> concluded we should take the time to develop an event-driven API for the
> non-blocking HTTP transport as a first step.
>
> All existing NIO HTTP transport implementations I have seen so far that
> rely on the InputStream / OutputStream abstraction suffer from the same
> fatal flaw. They tend to do an excessive amount of intermediate buffer
> copying and are very prone to 'out of memory' conditions under heavy
> load. My goal is to make sure that the NIO-based transport is not only
> non-blocking but also memory efficient. In those cases where no content
> encoding/decoding is involved I want to make sure that the HTTP service
> can _directly_ write to / read from the underlying socket channel
> without any intermediate buffering.
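
A minimal Java sketch of the direct-to-channel idea described above. The names `ChannelContentProducer` and `BufferContentProducer` are hypothetical and not part of HttpCore; the sketch only shows how content might be pushed into a channel without an intermediate stream or copy.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

// Hypothetical interface, not part of HttpCore: content is pushed straight into the channel.
interface ChannelContentProducer {
    /** Writes as much as the channel accepts; returns true once all content has been written. */
    boolean produce(WritableByteChannel channel) throws IOException;
}

/** Serves a fixed payload from a single buffer with no intermediate copies. */
final class BufferContentProducer implements ChannelContentProducer {
    private final ByteBuffer content;

    BufferContentProducer(ByteBuffer content) {
        this.content = content.asReadOnlyBuffer();   // shares the bytes, independent position
    }

    public boolean produce(WritableByteChannel channel) throws IOException {
        // A non-blocking channel may accept only part of the buffer; the buffer's
        // position remembers where to resume on the next write-ready event.
        channel.write(content);
        return !content.hasRemaining();
    }
}
```

For file-backed content, `FileChannel.transferTo` offers a comparable direct path from a file to a writable channel.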
>
> This decision also entails some refactoring in HttpCore proper, as I
> need certain classes decoupled from InputStream / OutputStream in order
> to be usable in NIO extensions. I will have to factor out the content
> length strategy code from the DefaultEntityDeserializer and
> DefaultEntitySerializer classes.
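
One way a content length strategy might look once it is decoupled from InputStream / OutputStream is sketched below. The class name, the sentinel values and the lower-cased header map are assumptions made for illustration, not HttpCore's API. Treating an unparsable Content-Length as "unknown" is also a simplification; a stricter implementation might reject the message instead.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; names and sentinel values are illustrative only.
final class ContentLengthRule {
    static final long IDENTITY = -1;  // body runs until the connection closes
    static final long CHUNKED = -2;   // body uses chunked transfer coding

    /** Decides the body framing from headers alone (keys assumed to be lower-cased). */
    static long determineLength(Map<String, String> headers) {
        String te = headers.get("transfer-encoding");
        if (te != null) {
            // Chunked framing applies when "chunked" is the final transfer coding.
            String last = te.substring(te.lastIndexOf(',') + 1).trim().toLowerCase(Locale.ROOT);
            return last.equals("chunked") ? CHUNKED : IDENTITY;
        }
        String cl = headers.get("content-length");
        if (cl != null) {
            try {
                long len = Long.parseLong(cl.trim());
                if (len >= 0) return len;
            } catch (NumberFormatException ignored) {
                // fall through: treat an unparsable length as unknown
            }
        }
        return IDENTITY;
    }
}
```

Because the decision depends only on headers, the same logic could serve both a stream-based and a channel-based entity reader or writer.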
>
> Any major objections to that?
>
> Cheers
>
> Evil Comrade Oleg
>
> > I also apologize for only providing my thoughts and not actual code
> > (as I've said I would a few times over the past few years), and am
> > very grateful to you and Roland and the others who do provide code.
> > As I'm sure you all know, it's difficult to find the time around a day
> > job that is constantly shifting directions.
> >
> > Sam
> >
> > On 10/9/06, Oleg Kalnichevski <olegk@apache.org> wrote:
> > > On Mon, 2006-10-09 at 13:26 -0400, Sam Berlin wrote:
> > > > I agree with Robert that it is much easier to go from an event driven
> > > > model to a blocking model.  If the first layer that HttpNIO exposes is
> > > > blocking, there'd need to be additional hacking below that in order to
> > > > remove the blocking / thread-based layer.  On the other hand, if the
> > > > first layer it exposes is non-blocking, it's relatively trivial to add
> > > > a thread on top of that and expose an additional blocking layer.
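
The claim above, that a blocking layer is easy to put on top of a non-blocking one, can be illustrated with a small Java sketch. `AsyncTransport` and `BlockingTransport` are hypothetical names invented for this example; the event-driven side is reduced to a single callback-based `send` method.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.function.Consumer;

// Hypothetical event-driven transport: a callback fires when the response is ready.
interface AsyncTransport {
    void send(String request, Consumer<String> onResponse, Consumer<Throwable> onError);
}

/** Blocking view layered over the event-driven one; the calling thread simply waits. */
final class BlockingTransport {
    private final AsyncTransport delegate;

    BlockingTransport(AsyncTransport delegate) { this.delegate = delegate; }

    String send(String request) throws InterruptedException, ExecutionException {
        CompletableFuture<String> result = new CompletableFuture<>();
        delegate.send(request, result::complete, result::completeExceptionally);
        return result.get();   // blocks until one of the callbacks has run
    }
}
```

Going the other way, from a blocking API to an event-driven one, has no comparably small adapter, since every pending call would still tie up a thread.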
> > > >
> > > > It is difficult to think of many scenarios that require (or are better
> > > > with) non-blocking I/O, but I would caution against excluding them
> > > > from HttpClient's scope.  If HttpClient 4.0 had been ready a year or
> > > > so ago (with an exposed non-blocking layer), we would definitely have
> > > > used it in LimeWire as the basis of file-transfers.  As-is, we
> > > > invented our own minimalistic non-blocking state-based http transport
> > > > for downloads.
> > > >
> > > > If the non-blocking layer is there, I guarantee that folks will be
> > > > able to find a use for it.  Whereas if only a blocking layer is there,
> > > > those developers looking for the high-performance asynchronous model
> > > > will have to go elsewhere.
> > > >
> > >
> > > Sam, Robert, et al
> > >
> > > Simply for practical reasons, while there are only two guys hacking (me
> > > and Roland) we ought not spread our efforts too thin. A full-blown
> > > event-driven API will take time to get right. I think it is more
> > > important to get an HttpCore ALPHA3 release that covers 95% of use cases
> > > out the door sooner rather than later. Beyond that it is just a matter
> > > of priorities and available time.
> > >
> > > Oleg
> > >
> > >
> > > > Sam
> > > >
> > > > On 10/9/06, Robert Olofsson <robo@khelekore.org> wrote:
> > > > >
> > > > > Since my proxy is almost fully nio/event based I would like to share
> > > > > a few comments.
> > > > >
> > > > > Oleg Kalnichevski wrote:
> > > > > > I am still quite skeptical about the usefulness of a fully
> > > > > > event-driven HTTP transport for one simple reason: asynchronous
> > > > > > (non-blocking) I/O transport makes no sense whatsoever if the
> > > > > > process of content generation or content consumption is
> > > > > > synchronous (blocking).
> > > > >
> > > > > There are many things that may block; here are a few examples:
> > > > > *) DNS lookups
> > > > > *) File reading/writing
> > > > > *) Database access
> > > > > *) All higher-level APIs that only give you a stream
> > > > > *) Calls to Runtime.exec
> > > > >
> > > > > That DNS lookups are also totally single-threaded in native code on
> > > > > some systems does not make things better.
> > > > >
> > > > > > If one
> > > > > > needs a worker thread to generate / process content anyway, what is
> > > > > > the point of having an event-driven transport?
> > > > >
> > > > > Agreed.
> > > > > One objection here may be that you do not need one worker thread for
> > > > > all of the content generation, but that usually does not make things
> > > > > better. You will still need the worker thread for the
> > > > > _slow_and_blocking_ operation.
> > > > > So if content generation/modification uses any of the above, then
> > > > > using workers simplifies things a lot.
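
The pattern discussed above, handing a slow blocking call to a worker and returning the result to the event loop, might look like this in Java. `BlockingOffloader` is a hypothetical helper written for this illustration; it is not taken from any of the projects in this thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.function.Consumer;

// Hypothetical helper: keeps slow, blocking calls off the selector thread.
final class BlockingOffloader {
    private final ExecutorService workers;
    // Completions queued here are meant to be drained by the event loop thread.
    private final ConcurrentLinkedQueue<Runnable> completions = new ConcurrentLinkedQueue<>();

    BlockingOffloader(ExecutorService workers) { this.workers = workers; }

    /** Runs the blocking task on a worker; the callback is deferred to the event loop. */
    <T> void submit(Callable<T> blockingTask, Consumer<T> onDone, Consumer<Exception> onError) {
        workers.execute(() -> {
            try {
                T value = blockingTask.call();
                completions.add(() -> onDone.accept(value));
            } catch (Exception e) {
                completions.add(() -> onError.accept(e));
            }
        });
    }

    /** Called from the event loop between select() calls; returns how many callbacks ran. */
    int drainCompletions() {
        int ran = 0;
        for (Runnable r; (r = completions.poll()) != null; ran++) {
            r.run();
        }
        return ran;
    }
}
```

A DNS lookup would be submitted as something like `() -> InetAddress.getByName(host)`. In a real event loop the worker would also call `Selector.wakeup()` after queuing a completion so that a blocked `select()` returns promptly.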
> > > > >
> > > > > > I see only a few scenarios
> > > > > > where the third choice (event callbacks) may prove advantageous,
> > > > > > primarily in HTTP proxies and gateways.
> > > > >
> > > > > Except that HTTP proxies do lots of DNS lookups, so they will block
> > > > > a lot. My proxy spawns worker threads only when it needs to, but it
> > > > > complicates some parts of the code.
> > > > >
> > > > > That my proxy also modifies the content and caches the data means
> > > > > lots of other blocking calls in some of the code paths.
> > > > >
> > > > > > I think ultimately we need both options. I suggest we start with
> > > > > > the second option, release ALPHA3 and then consider implementing
> > > > > > the third option before ALPHA4 / BETA1.
> > > > >
> > > > > One thing to keep in mind:
> > > > > It is easier to go from an event-driven model to a blocking model
> > > > > than to do the reverse. This may be an argument to go for number 3
> > > > > (full NIO). If you go for 3, then make it easy to use a few
> > > > > selector threads; otherwise the system will use only 1 CPU (or 1
> > > > > core).
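
The advice above about using a few selector threads could be sketched as a small pool of selectors, with new connections handed out round-robin. `SelectorPool` is a hypothetical class for illustration; the per-selector threads and their select loops are omitted.

```java
import java.io.IOException;
import java.nio.channels.Selector;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: one Selector per I/O thread, connections spread round-robin.
final class SelectorPool implements AutoCloseable {
    private final Selector[] selectors;
    private final AtomicInteger next = new AtomicInteger();

    SelectorPool(int threads) throws IOException {
        selectors = new Selector[threads];
        for (int i = 0; i < threads; i++) {
            selectors[i] = Selector.open();
        }
    }

    /** Picks the selector that should own the next accepted connection. */
    Selector nextSelector() {
        // floorMod keeps the index valid even after the counter overflows.
        int i = Math.floorMod(next.getAndIncrement(), selectors.length);
        return selectors[i];
    }

    @Override
    public void close() throws IOException {
        for (Selector s : selectors) s.close();
    }
}
```

A reasonable starting size for such a pool is `Runtime.getRuntime().availableProcessors()`, with each selector driven by its own thread.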
> > > > >
> > > > > /robo
> > > > >
> > > > > ---------------------------------------------------------------------
> > > > > To unsubscribe, e-mail: httpclient-dev-unsubscribe@jakarta.apache.org
> > > > > For additional commands, e-mail: httpclient-dev-help@jakarta.apache.org
> > > > >
> > > > >
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: httpclient-dev-unsubscribe@jakarta.apache.org
> > > > For additional commands, e-mail: httpclient-dev-help@jakarta.apache.org
> > > >
> > > >
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: httpclient-dev-unsubscribe@jakarta.apache.org
> > > For additional commands, e-mail: httpclient-dev-help@jakarta.apache.org
> > >
> > >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: httpclient-dev-unsubscribe@jakarta.apache.org
> > For additional commands, e-mail: httpclient-dev-help@jakarta.apache.org
> >
> >
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: httpclient-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: httpclient-dev-help@jakarta.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: httpclient-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: httpclient-dev-help@jakarta.apache.org


Mime
View raw message