incubator-cassandra-user mailing list archives

From Robert Coli <rc...@digg.com>
Subject Re: how large can a cluster over the WAN be?
Date Tue, 08 Mar 2011 07:52:09 GMT
On Mon, Mar 7, 2011 at 11:32 AM, John Lewis <lewilists@gmail.com> wrote:
> When you say decent latency and throughput what numbers do you consider decent? I know
> throughput would be highly dependent on the quantity of kb shoved through the pipe so I would
> expect throughput needs would be highly dependent on the data actually in cassandra.

As you say, the throughput needed depends on Cassandra payload size,
but also (in 0.7) on the read repair percentage. Cassandra consumes a
lot of network traffic relative to the amount of data served to
clients, because of background repair processes such as read repair and
manual AES (anti-entropy) repair. There are obviously scenarios where
you might saturate the WAN link, given large enough nodes or enough
nodes per datacenter.
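
To put rough numbers on that, here is a back-of-envelope sketch in Python.
It is a deliberately crude model, not how Cassandra accounts for traffic:
the function name and all parameters are made up for illustration, it
assumes every write and every read-repaired read ships a full payload to
each remote replica, and it ignores digests, gossip, hinted handoff and
anti-entropy repair streaming.

```python
def estimate_wan_mbps(reads_per_sec, writes_per_sec, avg_payload_bytes,
                      read_repair_chance, remote_replicas):
    """Very rough cross-datacenter traffic estimate, in megabits/second.

    Hypothetical model: every write is sent once per remote replica, and a
    fraction `read_repair_chance` of reads also moves one payload per
    remote replica.
    """
    write_bytes = writes_per_sec * avg_payload_bytes * remote_replicas
    repair_bytes = (reads_per_sec * read_repair_chance
                    * avg_payload_bytes * remote_replicas)
    # bytes/second -> megabits/second
    return (write_bytes + repair_bytes) * 8 / 1_000_000


if __name__ == "__main__":
    # Example inputs are invented; plug in your own workload numbers.
    mbps = estimate_wan_mbps(reads_per_sec=1000, writes_per_sec=500,
                             avg_payload_bytes=2048,
                             read_repair_chance=0.1, remote_replicas=1)
    print(f"approx. WAN traffic: {mbps:.2f} Mbit/s")
```

Comparing the result against the usable capacity of the WAN link gives a
first hint of whether saturation is a realistic concern, before any real
load test.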

When I talk about latency, my experience is with WAN latency
under 100ms and without DynamicEndpointSnitch (DES). I suspect that
latency within an order of magnitude of that, with or without DES, is
likely to be fine for many use cases. There are a few tunables which it
might be appropriate to increase when operating in more than two
datacenters with greater possible latency between any two, and there
are replication strategies and consistency levels which offer certain
latency behavior. As always, simulating your actual workload is likely
to give you the most relevant information about the impact of
inter-Cassandra latency on your application. :)
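
As a rough illustration of why the consistency level matters for latency,
here is a small Python sketch. It assumes a simplified model in which a
request completes once the required number of replicas have answered, so
the latency is that of the k-th fastest replica. The function name and
the round-trip times are invented for the example; this is no substitute
for measuring your own workload.

```python
def ack_latency_ms(replica_rtts_ms, required_acks):
    """Time until `required_acks` replicas have answered (simplified model)."""
    if not 1 <= required_acks <= len(replica_rtts_ms):
        raise ValueError("required_acks must be between 1 and the replica count")
    # The request finishes when the k-th fastest replica responds.
    return sorted(replica_rtts_ms)[required_acks - 1]


if __name__ == "__main__":
    local = [1.0, 1.5, 2.0]      # hypothetical RTTs inside one datacenter
    remote = [80.0, 82.0, 85.0]  # hypothetical RTTs across the WAN

    # A quorum over all six replicas (4 acks) must wait for a remote node.
    print(ack_latency_ms(local + remote, 4))
    # A quorum counted only among the three local replicas (2 acks) does not.
    print(ack_latency_ms(local, 2))
```

Under this model, any consistency level that needs an answer from the
remote datacenter pays at least one WAN round trip per request.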

=Rob
