hadoop-general mailing list archives

From Patrick Hunt <ph...@apache.org>
Subject Re: HTTP transport?
Date Thu, 12 Nov 2009 22:22:42 GMT
One additional benefit of using HTTP is that people are always working 
to improve its performance, and not only by optimizing servers -- see 
Google's SPDY:

http://www.readwriteweb.com/archives/spdy_google_wants_to_speed_up_the_web.php

Multiplexed requests, compressed headers, etc...

Patrick

Doug Cutting wrote:
> I'm considering an HTTP-based transport for Avro as the preferred, 
> high-performance option.
> 
> HTTP has lots of advantages.  In particular, it already has
>  - lots of authentication, authorization and encryption support;
>  - highly optimized servers;
>  - monitoring, logging, etc.
> 
> Tomcat and other servlet containers support async NIO, where a thread is 
> not required per connection.  A servlet can process bulk data with a 
> single copy to and from the socket (bypassing stream buffers).  Calls 
> can be multiplexed over a single HTTP connection using Comet events.
> 
> http://tomcat.apache.org/tomcat-6.0-doc/aio.html
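
[Editor's note: for concreteness, a minimal sketch of what such a servlet 
could look like against Tomcat 6's Comet API (org.apache.catalina.comet). 
The class name and the dispatch step are hypothetical; the event types and 
the read-while-available pattern follow the aio.html page above.]

import java.io.IOException;
import java.io.InputStream;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import org.apache.catalina.comet.CometEvent;
import org.apache.catalina.comet.CometProcessor;

public class AvroRpcServlet extends HttpServlet implements CometProcessor {
  public void event(CometEvent event) throws IOException, ServletException {
    switch (event.getEventType()) {
    case BEGIN:
      // Connection opened; no thread is parked on it while it sits idle.
      event.setTimeout(30 * 1000); // may be unsupported on some connectors
      break;
    case READ:
      // Bytes are available: drain them without blocking the container.
      InputStream in = event.getHttpServletRequest().getInputStream();
      byte[] buf = new byte[8192];
      while (in.available() > 0) {
        int n = in.read(buf);
        if (n < 0) { event.close(); return; }
        // dispatch(buf, n); // hypothetical: frame and dispatch one RPC call
      }
      break;
    case END:
    case ERROR:
    default:
      event.close();
      break;
    }
  }
}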
> 
> Zero copy is not an option for servlets that generate arbitrary data, 
> but one can specify a file/start/length tuple and Tomcat will use 
> sendfile to write the response.  That means that while HDFS datanode 
> file reads could not be done via RPC, they could be done via HTTP with 
> zero-copy.  If authentication and authorization are already done in the 
> HTTP server, this may not be a big loss.  The HDFS client might make two 
> HTTP requests, one to read a file's data and another to read its 
> checksums.  The server would then stream the entire block to the client 
> using sendfile, relying on TCP flow control as today.
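
[Editor's note: to make the file/start/length tuple concrete, a minimal 
sketch of a datanode read servlet triggering Tomcat's sendfile path. The 
servlet name, query parameters, and block path are hypothetical; the 
org.apache.tomcat.sendfile.* request attributes are the hooks documented 
for Tomcat 6's NIO/APR connectors, and must be set before the response is 
committed.]

import java.io.File;
import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class BlockReadServlet extends HttpServlet {
  protected void doGet(HttpServletRequest req, HttpServletResponse resp)
      throws ServletException, IOException {
    File block = new File("/dfs/data/blk_1234"); // hypothetical block file
    long start = Long.parseLong(req.getParameter("offset"));
    long length = Long.parseLong(req.getParameter("length"));

    resp.setContentType("application/octet-stream");
    resp.setHeader("Content-Length", String.valueOf(length));

    Boolean ok = (Boolean) req.getAttribute("org.apache.tomcat.sendfile.support");
    if (ok != null && ok.booleanValue()) {
      // Tomcat performs the zero-copy transfer itself after doGet returns.
      req.setAttribute("org.apache.tomcat.sendfile.filename", block.getCanonicalPath());
      req.setAttribute("org.apache.tomcat.sendfile.start", Long.valueOf(start));
      req.setAttribute("org.apache.tomcat.sendfile.end", Long.valueOf(start + length));
    } else {
      // Fall back to copying through the servlet output stream (not zero-copy).
    }
  }
}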
> 
> Thoughts?
> 
> Doug
