hadoop-general mailing list archives

From Scott Carey <sc...@richrelevance.com>
Subject Re: HTTP transport?
Date Tue, 06 Oct 2009 03:19:34 GMT

On 10/5/09 1:47 PM, "Ryan Rawson" <ryanobjc@gmail.com> wrote:

> I have a question about these headers... will they impact the ability to do
> many, but small, rpcs? Imagine you'd need to support 5,000 to 50,000
> rpcs/second. Would this help or hinder?

As long as the HTTP request and response each fit in one network packet
(pessimistically, ~1KB) there is not much overhead.

50k rpcs/sec on saturated gigabit ethernet (~100MB/sec) works out to ~2KB per
RPC.

So, on faster networks an extra 100 to 200 bytes or so won't matter.
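A quick back-of-envelope check of the numbers above (the link rate, RPC rate,
and 200-byte header figure are the illustrative values from this thread, not
measurements):

```python
# Assumed values from the discussion above, not measurements.
link_bytes_per_sec = 100 * 1024 * 1024  # saturated gigabit, ~100MB/sec
rpcs_per_sec = 50_000                   # upper end of the stated range
header_overhead = 200                   # extra HTTP header bytes per RPC

# Bytes available per RPC before the link saturates.
budget_per_rpc = link_bytes_per_sec / rpcs_per_sec   # ~2KB per RPC

# Fraction of that budget eaten by the extra headers.
overhead_fraction = header_overhead / budget_per_rpc

print(f"budget per RPC: {budget_per_rpc:.0f} bytes")
print(f"header overhead: {overhead_fraction:.1%}")
```

At these rates the extra headers cost on the order of 10% of the per-RPC byte
budget, which is why they only start to matter once the link itself is the
bottleneck.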

On a WAN, the extra bytes have more of an effect when bandwidth is low (and
latency is low enough not to dominate), especially if the RPC pattern is very
'chatty' rather than 'chunky'.

However, on most WAN links network latency is going to kill you far, far
more than an extra 200 bytes.
For example, imagine a 20ms latency link.  The max synchronous RPC throughput
to a single client is then 50/sec (one per 20ms).  With a 1KB payload per
request, that's ~50KB/sec max data transfer.  HTTP pipelining could help here
-- but isn't as well supported as one would like.
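The latency-bound case above can be sketched the same way (again using the
example's assumed values: one synchronous request in flight at a time over a
20ms round trip):

```python
# Assumed values from the example above: one synchronous RPC at a time.
round_trip_sec = 0.020    # 20ms latency per request/response round trip
payload_bytes = 1024      # 1KB payload per request

# With only one request in flight, the round trip caps the RPC rate.
max_rpcs_per_sec = 1 / round_trip_sec            # 50/sec
max_bytes_per_sec = max_rpcs_per_sec * payload_bytes

print(f"max RPCs/sec: {max_rpcs_per_sec:.0f}")
print(f"max throughput: {max_bytes_per_sec / 1024:.0f} KB/sec")
```

Note that payload size barely moves this ceiling: doubling the payload doubles
throughput but the RPC rate stays pinned at 50/sec, which is the sense in which
latency, not the extra 200 header bytes, dominates on such links.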

If WAN level RPC is a goal, the main challenges there will be latency
related first, and packet size related second.
On a fast local network (gigabit) I suspect other sorts of throughput problems
will be the issue before bandwidth lost to slightly larger packets.

Furthermore, it's not like a TCP packet is 0 bytes on its own.  HTTP adds
some overhead, but it can be kept relatively trim.
