incubator-cassandra-user mailing list archives

From Jonathan Colby <jonathan.co...@gmail.com>
Subject Re: understanding the cassandra storage scaling
Date Thu, 09 Dec 2010 11:14:28 GMT
Awesome! Thank you guys for the really quick answers and the links to
the presentations.

On Thu, Dec 9, 2010 at 12:06 PM, Sylvain Lebresne <sylvain@yakaz.com> wrote:
>> This helps a little, but unfortunately it's still a bit fuzzy for me.  So is it
>> not true that each node contains all the data in the cluster?
>
> Not at all. Basically, each node is responsible for only a part of the data (a
> range, really). But for each piece of data you can choose how many nodes it is
> stored on; this is the Replication Factor (RF).
>
> For instance, if you choose to have RF=1, then each piece of data will be on
> exactly one node (this is usually a bad idea since it offers very weak
> durability guarantees but nevertheless, it can be done).
>
> If you choose RF=3, each piece of data is on 3 nodes (independently of the
> number of nodes your cluster has). You can have all data on all nodes, but for
> that you'll have to choose an RF equal to the number of nodes in the cluster.
> But this is a very degenerate case.
>
>> how does my query get directed to the right node?
>
> Each node in the cluster knows the ranges of data that every other node holds.
> I suggest you watch the first video linked on this page:
>  http://wiki.apache.org/cassandra/ArticlesAndPresentations
> It explains this and more.
>
> --
> Sylvain
>
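
A minimal sketch of the idea quoted above, not Cassandra's actual code: each
node owns a range of a hash ring, a key is stored on RF consecutive nodes, and
any node that knows the ring can work out where a key lives. The `Ring` class,
the node names, the one-token-per-node layout and the use of MD5 as the hash
are all assumptions made for illustration.

```python
import bisect
import hashlib


class Ring:
    """Toy token ring: each node owns the range that ends at its token."""

    def __init__(self, nodes, replication_factor):
        if not 1 <= replication_factor <= len(nodes):
            raise ValueError("RF must be between 1 and the number of nodes")
        self.rf = replication_factor
        # Assumption: one token per node, derived by hashing the node name.
        self.ring = sorted((self._token(name), name) for name in nodes)
        self.tokens = [token for token, _ in self.ring]

    @staticmethod
    def _token(value):
        # Map any string to a large integer position on the ring.
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def replicas(self, key):
        """Return the RF nodes that hold `key`, primary owner first."""
        # First node whose token is >= the key's token, wrapping past the end.
        start = bisect.bisect_left(self.tokens, self._token(key)) % len(self.ring)
        # The remaining replicas are the next nodes clockwise on the ring.
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(self.rf)]


nodes = ["node-a", "node-b", "node-c", "node-d", "node-e"]
ring = Ring(nodes, replication_factor=3)
print(ring.replicas("user:42"))  # three of the five nodes, not all of them
```

With RF=1 `replicas` returns a single node, and with RF equal to the number of
nodes it returns every node, which is the degenerate case mentioned above.
Because every node can build the same ring, whichever node receives a query can
compute the replica list itself and forward the request.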
