hadoop-general mailing list archives

From Scott Carey <sc...@richrelevance.com>
Subject Re: HTTP transport?
Date Sun, 11 Oct 2009 01:11:20 GMT

On 10/9/09 10:49 AM, "Doug Cutting" <cutting@apache.org> wrote:

> 
>> It is an interesting question how much we
>> depend on being able to answer queries out of order. There are some
>> parts of the code where overlapping requests from the same client
>> matter. In particular, the terasort scheduler uses threads to access the
>> namenode. That would stop providing any pipelining, which I believe
>> would be significant.
> 
> No, we wouldn't stop any pipelining, we'd just use more connections to
> implement it.  With HttpClient one can limit the number of pooled
> connections per host:
> 

Also, since HTTP/1.1 supports in-order pipelining out of the box, it's only
out-of-order responses that would require additional connections.
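
To make the in-order point concrete, here is a toy Java model of one
pipelined connection. It is only a sketch and not a real HTTP client: the
class name PipelineSketch and its methods are hypothetical, and the
"responses" are plain strings. It shows that a pipelined response carries no
request identifier, so the client has to match it to the oldest outstanding
request.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Toy model of in-order pipelining on a single connection (hypothetical names). */
public class PipelineSketch {
    // Requests that have been sent but not yet answered, oldest first.
    private final Queue<String> pending = new ArrayDeque<String>();

    /** Send a request without waiting for earlier responses to arrive. */
    public void send(String requestId) {
        pending.add(requestId);
    }

    /**
     * Handle the next response read from the connection. Because responses
     * arrive in request order, it always answers the oldest pending request.
     * Throws NoSuchElementException if nothing is pending.
     */
    public String onResponse(String body) {
        String requestId = pending.remove();
        return requestId + " -> " + body;
    }

    /** Number of requests still waiting for a response. */
    public int inFlight() {
        return pending.size();
    }

    public static void main(String[] args) {
        PipelineSketch conn = new PipelineSketch();
        conn.send("A");
        conn.send("B"); // sent before A's response arrives
        System.out.println(conn.onResponse("first"));  // matched to A
        System.out.println(conn.onResponse("second")); // matched to B
    }
}
```

If the response to B is ready before the response to A, it still has to wait
behind A on this connection. That waiting is the case where a second
connection helps.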

> 
> Doug
> 

Requirements may end up ruling out HTTP, but I doubt that performance (in
the insecure case) will be the cause, since there are so many
high-performance client and server implementations available.
Consider something lower level than the Servlet API for the server side:
it is baggage-laden, and it does not give access to all data in unconverted
form or to any asynchronous I/O.

In this respect, Jetty has lower-level, lightweight API access points:
http://docs.codehaus.org/display/JETTY/Architecture

If HTTP is not used, I suggest a strong look at Apache MINA for constructing
high-performance NIO clients and servers in Java: http://mina.apache.org/

