cassandra-user mailing list archives

From Mark Robson <mar...@gmail.com>
Subject Re: using cassandra as a real time DW
Date Fri, 06 Nov 2009 21:35:19 GMT
2009/11/6 Joe Stump <joe@joestump.net>

>
> Can you explain what you mean by lack of load balancing?
>


Nothing in Cassandra yet attempts to ensure that your data is spread evenly
across the nodes (there are several open bugs about this).

If you use the OrderPreservingPartitioner, your data will in all likelihood
be spread so unevenly that most of your servers aren't used at all.
This obviously doesn't scale.
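
To illustrate the skew, here is a minimal Python sketch. It is not Cassandra code: the `owner` helper, the four single-letter tokens and the `userNNNN` key format are all made up for the example. It assumes only that each node owns the keys that sort at or below its token, which is roughly how an order-preserving partitioner assigns ranges.

```python
from bisect import bisect_left
from collections import Counter

def owner(key, tokens):
    """Return the index of the node that owns `key`.

    Assumes `tokens` is sorted and that a node owns every key up to and
    including its own token; keys above the last token wrap to node 0.
    """
    i = bisect_left(tokens, key)
    return i % len(tokens)

# Hypothetical tokens for a four-node cluster, chosen to split the alphabet.
tokens = ["g", "n", "t", "z"]

# Keys that share a common prefix, as application keys often do.
keys = [f"user{i:04d}" for i in range(1000)]

# Count how many keys each node ends up holding.
counts = Counter(owner(k, tokens) for k in keys)
print(counts)
```

Because every key starts with "user", they all sort between "t" and "z" and land on a single node, while the other three hold nothing.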

The RandomPartitioner is better because the hashing it does causes data to
spread out, but the tokens are still chosen randomly so there's no way to
guarantee that machines get equal or even similar(ish) amounts of data.
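
A rough way to see the effect of random token choice is to simulate it. The Python sketch below is not Cassandra code. It assumes a ring of 2**127 token values and that each node owns the range from the previous token up to its own; the function names are invented for the example. It compares randomly chosen tokens with evenly spaced ones.

```python
import random

RING_SIZE = 2 ** 127  # assumed size of the token space, for illustration

def ownership_fractions(tokens, ring_size=RING_SIZE):
    """Return the fraction of the ring owned by each token, in token order.

    A node is assumed to own the range from the previous token (exclusive)
    to its own token (inclusive), wrapping around the ring.
    """
    ordered = sorted(tokens)
    if len(ordered) == 1:
        return [1.0]
    fractions = []
    for i, token in enumerate(ordered):
        prev = ordered[i - 1]  # for i == 0 this wraps to the last token
        span = (token - prev) % ring_size
        fractions.append(span / ring_size)
    return fractions

def random_tokens(n, seed=0):
    """Pick n tokens uniformly at random, as if each node chose its own."""
    rng = random.Random(seed)
    return [rng.randrange(RING_SIZE) for _ in range(n)]

def even_tokens(n):
    """Pick n tokens spaced evenly around the ring."""
    return [i * RING_SIZE // n for i in range(n)]

if __name__ == "__main__":
    for name, toks in (("random", random_tokens(10)), ("even", even_tokens(10))):
        shares = ownership_fractions(toks)
        print(f"{name}: min {min(shares):.3f}, max {max(shares):.3f}")
```

With evenly spaced tokens every node owns the same share. With random tokens the shares differ from node to node, and the hashing of keys does nothing to correct that.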

Mark
