incubator-cassandra-user mailing list archives

From Subscriber <>
Subject How to scale Cassandra?
Date Mon, 04 Jul 2011 09:54:20 GMT
Hi there, 

I have read a lot about Cassandra's scalability features: seamless addition of nodes,
no downtime, etc.
But I wonder how one does this in practice in an operational system.

In the system we are going to implement, we expect a huge number of writes with uniformly
distributed keys (the keys are given and cannot be generated). That means RandomPartitioner
will result in (more or less) the same workload per node as an order-preserving partitioner
such as OrderPreservingPartitioner, right?

But how do you scale a (more or less) balanced Cassandra cluster? I think that in the end
you always have to double the number of nodes: adding just a handful of nodes relieves only
the ranges that get split, while the workload of the untouched ranges keeps growing at the
same rate.
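
As a rough sketch of this argument (a toy model, not Cassandra code): assume a perfectly
balanced ring, uniformly distributed keys, and that each new node takes over exactly half
of one existing node's range. The function names `ring_shares` and `max_share` are made up
for illustration.

```python
from fractions import Fraction


def ring_shares(n_nodes, n_added):
    """Per-node share of the key space after adding n_added nodes to a
    balanced ring of n_nodes, where each new node bisects one old range."""
    if not 0 <= n_added <= n_nodes:
        raise ValueError("this toy model splits each old range at most once")
    base = Fraction(1, n_nodes)
    # Every split range is now served by two nodes with half the share each.
    split = [base / 2] * (2 * n_added)
    # Ranges that were not split keep their full original share.
    untouched = [base] * (n_nodes - n_added)
    return split + untouched


def max_share(n_nodes, n_added):
    """Share of the most heavily loaded node after the expansion."""
    return max(ring_shares(n_nodes, n_added))
```

In this model, `max_share(100, 5)` is still 1/100, the same as before the expansion, and
only `max_share(100, 100)` (doubling) brings the busiest node down to 1/200.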

This seems to be OK for small clusters. But what do you do when you have several hundred
nodes in your cluster?
It seems to me that a balanced cluster is a blessing for performance but a curse for scalability...

What are the alternatives? One could redistribute the token ranges, but this would cause
downtime (AFAIK), which is not an option!
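
To put a number on how much redistribution a non-doubling expansion would involve, here is
a minimal sketch. It assumes the commonly used convention for evenly spaced RandomPartitioner
tokens, token_i = i * 2^127 / N, and it only counts how many existing nodes would need a new
token to end up with a balanced ring again. `balanced_tokens` and `tokens_to_move` are
hypothetical helper names, not part of any Cassandra tool.

```python
RING_SIZE = 2 ** 127  # assumed size of the RandomPartitioner token space


def balanced_tokens(n_nodes):
    """Evenly spaced tokens for a ring of n_nodes."""
    return [i * RING_SIZE // n_nodes for i in range(n_nodes)]


def tokens_to_move(old_n, new_n):
    """Number of existing nodes whose current token does not appear in
    the balanced layout for new_n nodes, i.e. nodes that would have to move."""
    new_layout = set(balanced_tokens(new_n))
    return sum(1 for token in balanced_tokens(old_n) if token not in new_layout)
```

Under these assumptions `tokens_to_move(4, 8)` is 0 (doubling keeps every old token), while
`tokens_to_move(4, 5)` is 3: every node except the one at token 0 would have to change its
range.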

Is there anything I have misunderstood, or am I missing something else? Is the only remaining
strategy to make sure that the cluster grows unbalanced, so that one can add nodes to the
hotspots? In that case, however, you have to make sure that this strategy is sustainable.
That could be too optimistic...

Best Regards