Message-ID: <19196d860610161035r61caae74l2aa91c28469572c4@mail.gmail.com>
Date: Mon, 16 Oct 2006 13:35:44 -0400
From: "Sam Berlin"
To: "HttpClient Project"
Subject: Re: [HttpCore] NIO extensions: event-driven non-blocking HTTP transport
In-Reply-To: <1160686720.4968.28.camel@localhost.localdomain>

No objection from me. :)

I agree with your conclusion. Using NIO classes (that is, channels /
buffers / selectors [not necessarily CharsetEncoders or CharsetDecoders])
for anything other than non-blocking processing is largely a waste, as it
just adds complexity with few, if any, gains.

If you want, you may be able to take some ideas from the non-blocking
HTTP code we added to LimeWire. We developed a fairly robust state
machine that can have arbitrary read/write states added to it. An example
use for this is adding a state that reads (or writes) headers, then
adding other states that can respond to those headers, then adding a
state that reads or writes the body of the message. The specifics of the
code may not be relevant (since you're using a different underlying NIO
layer), but the generic idea is the same.
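To make the idea concrete, here is a minimal sketch of such a state
machine. It is not the LimeWire code: the names IoState, IoStateMachine,
ReadHeadersState and ReadBodyState are made up for illustration, header
parsing is deliberately naive, and only the read side is shown. A header
state runs first, and a body state then reacts to the Content-Length that
the header state parsed.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class StateMachineSketch {

    /** One step of a connection. Hypothetical interface, not LimeWire's IOState. */
    interface IoState {
        /** Consumes what it can from buf; returns true once this state is finished. */
        boolean process(ByteBuffer buf);
    }

    /** Runs the queued states in order, advancing when one reports completion. */
    static final class IoStateMachine {
        private final Deque<IoState> states = new ArrayDeque<>();

        void addState(IoState state) { states.addLast(state); }

        /** Would be called from the selector loop whenever bytes have been read. */
        void handleRead(ByteBuffer buf) {
            while (!states.isEmpty() && states.peekFirst().process(buf)) {
                states.removeFirst();
            }
        }

        boolean isDone() { return states.isEmpty(); }
    }

    /** Accumulates bytes until the blank line that ends the header block. */
    static final class ReadHeadersState implements IoState {
        final Map<String, String> headers = new LinkedHashMap<>();
        private final StringBuilder text = new StringBuilder();

        public boolean process(ByteBuffer buf) {
            while (buf.hasRemaining()) {
                text.append((char) (buf.get() & 0xFF)); // treat bytes as ISO-8859-1
                int n = text.length();
                if (n >= 4 && text.substring(n - 4).equals("\r\n\r\n")) {
                    parse();
                    return true;
                }
            }
            return false; // need more data; the machine stays in this state
        }

        private void parse() {
            String[] lines = text.toString().split("\r\n");
            for (int i = 1; i < lines.length; i++) { // index 0 is the start line
                int colon = lines[i].indexOf(':');
                if (colon > 0) {
                    headers.put(lines[i].substring(0, colon).trim().toLowerCase(Locale.ROOT),
                                lines[i].substring(colon + 1).trim());
                }
            }
        }
    }

    /** Reads a body whose length comes from the headers parsed by the earlier state. */
    static final class ReadBodyState implements IoState {
        final StringBuilder body = new StringBuilder();
        private final ReadHeadersState head;
        private int remaining = -1; // unknown until the header state has finished

        ReadBodyState(ReadHeadersState head) { this.head = head; }

        public boolean process(ByteBuffer buf) {
            if (remaining < 0) {
                remaining = Integer.parseInt(head.headers.getOrDefault("content-length", "0"));
            }
            while (remaining > 0 && buf.hasRemaining()) {
                body.append((char) (buf.get() & 0xFF));
                remaining--;
            }
            return remaining == 0;
        }
    }

    public static void main(String[] args) {
        ReadHeadersState head = new ReadHeadersState();
        ReadBodyState body = new ReadBodyState(head);
        IoStateMachine machine = new IoStateMachine();
        machine.addState(head);
        machine.addState(body);

        // Simulate data arriving in arbitrary fragments, as it would from a channel.
        String[] chunks = { "HTTP/1.1 200 OK\r\nContent-Len", "gth: 5\r\n\r\nhel", "lo" };
        for (String chunk : chunks) {
            machine.handleRead(ByteBuffer.wrap(chunk.getBytes(StandardCharsets.ISO_8859_1)));
        }
        System.out.println(head.headers + " body=" + body.body + " done=" + machine.isDone());
    }
}
```

Because each state only reports "finished" or "need more data", the
machine never blocks, and new states (chunked bodies, write states) can
be queued without touching the existing ones.
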

The state machine code is at:
https://www.limewire.org/fisheye/browse/~raw,r=1.6/limecvs/core/com/limegroup/gnutella/io/IOStateMachine.java

The generic IOState interface is:
https://www.limewire.org/fisheye/browse/~raw,r=1.2/limecvs/core/com/limegroup/gnutella/io/IOState.java

And the read/write headers implementations are:
https://www.limewire.org/fisheye/browse/~raw,r=1.2/limecvs/core/com/limegroup/gnutella/http/ReadHeadersIOState.java
and
https://www.limewire.org/fisheye/browse/~raw,r=1.3/limecvs/core/com/limegroup/gnutella/http/WriteHeadersIOState.java

Sam

On 10/12/06, Oleg Kalnichevski wrote:
> On Mon, 2006-10-09 at 15:38 -0400, Sam Berlin wrote:
> > I can't argue with that. :)
> >
> > Perhaps, while you're developing the API, keep the door open in your
> > mind about this blocking model being just one kind of interface to the
> > data? That is, consider reading the data via a pluggable
> > 'interpreter' of some sort that's installed over the socket and parses
> > the data to set headers (and other things). That way, in the future
> > when more time is available for everyone, it won't require breaking
> > backwards compatibility in order to retrofit non-blocking I/O.
> >
>
> Folks,
>
> After having spent some time evaluating the two options, I finally
> concluded we should take the time and develop an event-driven API for
> the non-blocking HTTP transport as a first step.
>
> All existing NIO HTTP transport implementations that rely on the
> InputStream / OutputStream abstraction I have seen so far suffer from
> the same fatal flaw. They tend to do an excessive amount of intermediate
> buffer copying and are very prone to 'out of memory' conditions under
> heavy load. My goal is to make sure that the NIO-based transport is not
> only non-blocking but also memory efficient. In those cases where no
> content encoding/decoding is involved I want to make sure that the HTTP
> service can _directly_ write to / read from the underlying socket
> channel without any intermediate buffering.
>
> This decision also entails some refactoring in HttpCore proper, as I
> need certain classes decoupled from InputStream / OutputStream in order
> to be usable in NIO extensions. I will have to factor out the content
> length strategy code from the DefaultEntityDeserializer and
> DefaultEntitySerializer classes.
>
> Any major objections to that?
>
> Cheers
>
> Evil Comrade Oleg
>
> > I also apologize for only providing my thoughts and not actual code
> > (as I've said I would a few times over the past few years), and am
> > very grateful to you and Roland and the others who do provide code.
> > As I'm sure you all know, it's difficult to find the time around a day
> > job that is constantly shifting directions.
> >
> > Sam
> >
> > On 10/9/06, Oleg Kalnichevski wrote:
> > > On Mon, 2006-10-09 at 13:26 -0400, Sam Berlin wrote:
> > > > I agree with Robert that it is much easier to go from an event-driven
> > > > model to a blocking model. If the first layer that HttpNIO exposes is
> > > > blocking, there'd need to be additional hacking below that in order to
> > > > remove the blocking / thread-based layer. On the other hand, if the
> > > > first layer it exposes is non-blocking, it's relatively trivial to add
> > > > a thread on top of that and expose an additional blocking layer.
> > > >
> > > > It is difficult to think of many scenarios that require (or are better
> > > > with) non-blocking I/O, but I would caution against excluding them
> > > > from HttpClient's scope. If HttpClient 4.0 had been ready a year or
> > > > so ago (with an exposed non-blocking layer), we would definitely have
> > > > used it in LimeWire as the basis of file transfers. As-is, we
> > > > invented our own minimalistic non-blocking state-based HTTP transport
> > > > for downloads.
> > > >
> > > > If the non-blocking layer is there, I guarantee that folks will be
> > > > able to find a use for it.
> > > > Whereas if only a blocking layer is there,
> > > > those developers looking for the high-performance asynchronous model
> > > > will have to go elsewhere.
> > > >
> > >
> > > Sam, Robert, et al
> > >
> > > Simply for practical reasons while there are only two guys hacking (me
> > > and Roland) we ought not spread our efforts too thin. A full-blown
> > > event-driven API will take time to get right. I think it is more
> > > important to get the HttpCore ALPHA3 release that covers 95% of use
> > > cases out the door sooner rather than later. Beyond that it is just a
> > > matter of priorities and available time.
> > >
> > > Oleg
> > >
> > >
> > > > Sam
> > > >
> > > > On 10/9/06, Robert Olofsson wrote:
> > > > >
> > > > > Since my proxy is almost fully nio/event based I would like to share
> > > > > a few comments.
> > > > >
> > > > > Oleg Kalnichevski wrote:
> > > > > > I am still quite skeptical about usefulness of a fully event-driven HTTP
> > > > > > transport for one simple reason: asynchronous (non-blocking) I/O
> > > > > > transport makes no sense whatsoever if the process of content
> > > > > > generation or content consumption is synchronous (blocking).
> > > > >
> > > > > There are many things that may block; here are a few examples:
> > > > > *) DNS lookup
> > > > > *) File reading/writing
> > > > > *) Database access
> > > > > *) All higher-level APIs that only give you a stream.
> > > > > *) Calls to Runtime.exec
> > > > >
> > > > > That DNS lookups are also totally single-threaded in native code on
> > > > > some systems does not make things better.
> > > > >
> > > > > > If one
> > > > > > needs a worker thread to generate / process content anyways, what is the
> > > > > > point of having an event-driven transport?
> > > > >
> > > > > Agreed.
> > > > > One objection here may be that you do not need one worker thread for all
> > > > > of the content generation, but that usually does not make things better.
> > > > > You will still need the worker thread for the _slow_and_blocking_
> > > > > operation.
> > > > > So if content generation/modification uses any of the above then using
> > > > > workers simplifies things a lot.
> > > > >
> > > > > > I see only a few scenarios
> > > > > > where the third choice (event callbacks) may prove advantageous,
> > > > > > primarily in HTTP proxies and gateways.
> > > > >
> > > > > Except that HTTP proxies do lots of DNS lookups, so they will block
> > > > > a lot. My proxy spawns worker threads only when it needs to, but that
> > > > > complicates some parts of the code.
> > > > >
> > > > > That my proxy also modifies the content and caches the data means
> > > > > lots of other blocking calls in some of the code paths.
> > > > >
> > > > > > I think ultimately we need both options. I suggest we start with the
> > > > > > second option, release ALPHA3 and then consider implementing the third
> > > > > > option before ALPHA4 / BETA1.
> > > > >
> > > > > One thing to keep in mind:
> > > > > It is easier to go from an event-driven to a blocking model than to do
> > > > > the reverse. This may be an argument to go for number 3 (full nio).
> > > > > If you go for 3 then make it easy to use a few selector threads;
> > > > > otherwise the system will use only 1 CPU (or 1 core).
> > > > >
> > > > > /robo
> > > > >
> > > > > ---------------------------------------------------------------------
> > > > > To unsubscribe, e-mail: httpclient-dev-unsubscribe@jakarta.apache.org
> > > > > For additional commands, e-mail: httpclient-dev-help@jakarta.apache.org