cassandra-user mailing list archives

From Jeff Williams <je...@wherethebitsroam.com>
Subject Re: Confusion regarding the terms "replica" and "replication factor"
Date Wed, 30 May 2012 19:14:51 GMT
First, note that replication is done at the row level, not at the node level.

That line should look more like:

placement_strategy = 'NetworkTopologyStrategy' and strategy_options = {DC1: 1, DC2: 1, DC3: 1}

This means that each row will have one copy in each DC, and within each DC its placement will
be according to the partitioner, so it could be on any of the nodes in that DC.

So, don't think of it as nodes replicating each other, but rather as each DC storing the
configured number of copies of each row.
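To make the placement rule concrete, here is a rough sketch in Python of the idea (not Cassandra's actual code, and it ignores rack awareness): within each DC, walk the ring clockwise from the row's token and take the first N distinct nodes belonging to that DC. The ring layout and node names below are made up for illustration.

```python
# Toy sketch of per-DC replica placement, NOT Cassandra's implementation.
# Within each DC, walk the ring clockwise from the row's token and take
# the first rf_per_dc[dc] nodes that belong to that DC.
from bisect import bisect_right

def place_replicas(ring, row_token, rf_per_dc):
    """ring: sorted list of (token, node, dc); rf_per_dc: {dc: copies}."""
    tokens = [t for t, _, _ in ring]
    start = bisect_right(tokens, row_token) % len(ring)  # first node past the token
    replicas = {dc: [] for dc in rf_per_dc}
    for i in range(len(ring)):
        _, node, dc = ring[(start + i) % len(ring)]
        if dc in replicas and len(replicas[dc]) < rf_per_dc[dc]:
            replicas[dc].append(node)
    return replicas

# A made-up 6-node ring, 2 nodes per DC.
ring = [(0, 'a1', 'DC1'), (25, 'b1', 'DC2'), (50, 'c1', 'DC3'),
        (75, 'a2', 'DC1'), (100, 'b2', 'DC2'), (125, 'c2', 'DC3')]

print(place_replicas(ring, 60, {'DC1': 1, 'DC2': 1, 'DC3': 1}))
# {'DC1': ['a2'], 'DC2': ['b2'], 'DC3': ['c2']}
```

Note how every DC gets its copy regardless of which DC "owns" the token range the row hashes into; the token only decides where the walk starts.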

Also, replication is not related to the seed nodes. Seed nodes allow the nodes to find each
other initially, but are not special otherwise - any node can be used as a seed node.

So if you had a strategy like:

placement_strategy = 'NetworkTopologyStrategy' and strategy_options = {DC1: 3, DC2: 2, DC3: 1}

Each row would exist on 3 of the 4 nodes in DC1, on 2 of the 4 nodes in DC2, and on one of
the nodes in DC3. Again, the placement within each DC is due to the partitioner, based on
the row key.
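As a quick worked check of that strategy (just arithmetic, not Cassandra code): the cluster-wide replication factor is simply the sum of the per-DC factors, and a per-DC quorum - as used by LOCAL_QUORUM reads and writes - is floor(rf/2) + 1.

```python
# Illustrative arithmetic for the strategy above.
strategy_options = {'DC1': 3, 'DC2': 2, 'DC3': 1}

# Cluster-wide replication factor: sum of the per-DC factors.
total_rf = sum(strategy_options.values())
print(total_rf)  # 6 copies of each row cluster-wide

# Per-DC quorum size: floor(rf / 2) + 1.
local_quorums = {dc: rf // 2 + 1 for dc, rf in strategy_options.items()}
print(local_quorums)  # {'DC1': 2, 'DC2': 2, 'DC3': 1}
```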

Jeff

On May 29, 2012, at 11:25 PM, David Fischer wrote:

> Ok now i am confused :),
> 
> ok if i have the following
> placement_strategy = 'NetworkTopologyStrategy' and strategy_options = {DC1:R1, DC2:R1, DC3:R1}
> 
> this means in each of my datacenters I will have one full replica that
> can also be a seed node?
> If I have 3 nodes in addition to the DC replicas, then with normal token
> calculations a key can be in any datacenter, plus on each of the
> replicas, right?
> It will show 12 nodes total in its ring
> 
> On Thu, May 24, 2012 at 2:39 AM, aaron morton <aaron@thelastpickle.com> wrote:
>> This is partly historical. NTS (as it is now) has not always existed and was not
>> always the default. In days gone by, used to be a fella could run a mighty fine
>> key-value store using just a Simple Replication Strategy.
>> 
>> A different way to visualise it is a single ring with a Z axis for the DCs. When
>> you look at the ring from the top you can see all the nodes. When you look at it
>> from the side you can see the nodes are on levels that correspond to their DC.
>> Simple Strategy looks at the ring from the top. NTS works through the layers of
>> the ring.
>> 
>>> If the hierarchy is Cluster ->
>>> DataCenter -> Node, why exactly do we need globally unique node tokens
>>> even though nodes are at the lowest level in the hierarchy.
>> Nodes having a DC is a feature of *some* snitches and utilised by *some* of the
>> replication strategies (and by the messaging system for network efficiency). For
>> background, mapping from row tokens to nodes is based on
>> http://en.wikipedia.org/wiki/Consistent_hashing
>> 
>> Hope that helps.
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 24/05/2012, at 1:07 AM, java jalwa wrote:
>> 
>>> Thanks Aaron. That makes things clear.
>>> So I guess the 0 - 2^127 range for tokens corresponds to a cluster
>>> -level top-level ring. and then you add some logic on top of that with
>>> NTS to logically segment that range into sub-rings as per the notion
>>> of data clusters defined in NTS. Whats the advantage of having a
>>> single top-level ring ? intuitively it seems like each replication
>>> group could have a separate ring so that the same tokens can be
>>> assigned to nodes in different DC. If the hierarchy is Cluster ->
>>> DataCenter -> Node, why exactly do we need globally unique node tokens
>>> even though nodes are at the lowest level in the hierarchy.
>>> 
>>> Thanks again.
>>> 
>>> 
>>> On Wed, May 23, 2012 at 3:14 AM, aaron morton <aaron@thelastpickle.com> wrote:
>>>>> Now if a row key hash is mapped to a range owned by a node in DC3,
>>>>> will the Node in DC3 still store the key as determined by the
>>>>> partitioner and then walk the ring and store 2 replicas each in DC1
>>>>> and DC2 ?
>>>> No, only nodes in the DC's specified in the NTS configuration will be replicas.
>>>> 
>>>>> Or will the co-ordinator node be aware of the
>>>>> replica placement strategy,
>>>>> and override the partitioner's decision and walk the ring until it
>>>>> first encounters a node in DC1 or DC2 ? and then place the remaining
>>>>> replicas ?
>>>> NTS considers each DC to have its own ring. This can make token selection in
>>>> a multi-DC environment confusing at times. There is something in the DS docs
>>>> about it.
>>>> 
>>>> Cheers
>>>> 
>>>> -----------------
>>>> Aaron Morton
>>>> Freelance Developer
>>>> @aaronmorton
>>>> http://www.thelastpickle.com
>>>> 
>>>> On 23/05/2012, at 3:16 PM, java jalwa wrote:
>>>> 
>>>>> Hi all,
>>>>>              I am a bit confused regarding the terms "replica" and
>>>>> "replication factor". Assume that I am using RandomPartitioner and
>>>>> NetworkTopologyStrategy for replica placement.
>>>>> From what I understand, with a RandomPartitioner, a row key will
>>>>> always be hashed and be stored on the node that owns the range to
>>>>> which the key is mapped.
>>>>> http://www.datastax.com/docs/1.0/cluster_architecture/replication#networktopologystrategy.
>>>>> The example here, talks about having 2 data centers and a replication
>>>>> factor of 4 with 2 replicas in each datacenter, so the strategy is
>>>>> configured as DC1:2 and DC2:2. Now suppose I add another datacenter
>>>>> DC3, and do not change the NetworkTopologyStrategy.
>>>>> Now if a row key hash is mapped to a range owned by a node in DC3,
>>>>> will the Node in DC3 still store the key as determined by the
>>>>> partitioner and then walk the ring and store 2 replicas each in DC1
>>>>> and DC2 ? Will that mean that I will then have 5 replicas in the
>>>>> cluster and not 4 ? Or will the co-ordinator node be aware of the
>>>>> replica placement strategy,
>>>>> and override the partitioner's decision and walk the ring until it
>>>>> first encounters a node in DC1 or DC2 ? and then place the remaining
>>>>> replicas ?
>>>>> 
>>>>> Thanks.
>>>> 
>> 

