2009/11/6 Joe Stump <joe@joestump.net>

Can you explain what you mean by lack of load balancing?


Nothing in Cassandra attempts to ensure that your data are equally spread over the different nodes (yet; there are several bugs open to this effect).

If you use the OrderPreservingPartitioner, in all likelihood your data will be very unevenly spread, to the point where most of your servers aren't used at all. This obviously doesn't scale.

The RandomPartitioner is better because the hashing it does causes data to spread out, but each node's token is still chosen randomly, so there's no way to guarantee that machines get equal or even similar(ish) amounts of data.
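
To make the random-token point concrete, here is a rough sketch in Python. It is a toy model and not Cassandra code. The ring size of 2**127, the MD5 hashing, the rule that a key goes to the first node token at or after its own token, and all function names are assumptions made for illustration. It counts how many keys each node would own when node tokens are picked at random, with evenly spaced tokens shown only for comparison.

```python
import bisect
import hashlib
import random

RING_SIZE = 2 ** 127  # assumed token space for this sketch


def key_token(key: str) -> int:
    """Hash a key onto the ring (MD5, reduced modulo the ring size)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def keys_per_node(node_tokens, keys):
    """Count keys per node, listed in ascending node-token order.

    A key is assigned to the first node whose token is >= the key's token,
    wrapping around to the lowest token past the end of the ring.
    """
    tokens = sorted(node_tokens)
    counts = [0] * len(tokens)
    for key in keys:
        idx = bisect.bisect_left(tokens, key_token(key))
        counts[idx % len(tokens)] += 1
    return counts


def random_tokens(n, rng):
    """Each node picks its token uniformly at random."""
    return [rng.randrange(RING_SIZE) for _ in range(n)]


def even_tokens(n):
    """Evenly spaced tokens, for comparison only."""
    return [i * RING_SIZE // n for i in range(n)]


if __name__ == "__main__":
    keys = [f"user{i}" for i in range(100_000)]
    rng = random.Random(42)
    print("random tokens:", keys_per_node(random_tokens(10, rng), keys))
    print("even tokens:  ", keys_per_node(even_tokens(10), keys))
```

The keys themselves hash uniformly in both runs. What differs is the size of the ring segment each node ends up owning, and with random tokens nothing bounds how unequal those segments are.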

Mark