cassandra-user mailing list archives

From David Fischer <fischer....@gmail.com>
Subject Re: Confusion regarding the terms "replica" and "replication factor"
Date Tue, 29 May 2012 21:25:00 GMT
OK, now I am confused :)

If I have the following:
placement_strategy = 'NetworkTopologyStrategy'  and strategy_options =
{DC1:R1,DC2:R1,DC3:R1 }

Does this mean that in each of my datacenters I will have one full replica
that can also be a seed node?
If I have 3 nodes in addition to the DC replicas, then with normal token
calculations a key can be in any datacenter plus on each of the
replicas, right?
It will show 12 nodes total in its ring.
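
To make the question concrete, here is a minimal sketch of how NetworkTopologyStrategy-style placement could pick replicas when each datacenter asks for one copy. It is an illustration only, not Cassandra's actual code: the ring contents, node names, and the `nts_replicas` helper are hypothetical, and rack awareness is ignored.

```python
from bisect import bisect_left

# Hypothetical 6-node ring: (token, node name, datacenter).
# Tokens are globally unique across all datacenters.
RING = sorted([
    (0, "dc1-a", "DC1"), (1, "dc2-a", "DC2"), (2, "dc3-a", "DC3"),
    (50, "dc1-b", "DC1"), (51, "dc2-b", "DC2"), (52, "dc3-b", "DC3"),
])


def nts_replicas(key_token, ring, replicas_per_dc):
    """Walk the ring clockwise from the node owning key_token and take
    nodes until every datacenter listed in replicas_per_dc has its
    requested number of replicas. Nodes in unlisted datacenters are skipped.
    """
    tokens = [token for token, _, _ in ring]
    # A node owns the range ending at its own token, so the owner is the
    # first node whose token is >= key_token (wrapping past the last node).
    start = bisect_left(tokens, key_token) % len(ring)
    remaining = dict(replicas_per_dc)
    chosen = []
    for step in range(len(ring)):
        _, node, dc = ring[(start + step) % len(ring)]
        if remaining.get(dc, 0) > 0:
            chosen.append(node)
            remaining[dc] -= 1
    return chosen
```

With this toy ring, `nts_replicas(10, RING, {"DC1": 1, "DC2": 1, "DC3": 1})` selects exactly one node in each datacenter, so every key has three replicas in total. Calling it with `{"DC1": 2, "DC2": 2}` never selects a DC3 node, even when the key's token falls in a range owned by one.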

On Thu, May 24, 2012 at 2:39 AM, aaron morton <aaron@thelastpickle.com> wrote:
> This is partly historical. NTS (as it is now) has not always existed and was not always
> the default. In days gone by used to be a fella could run a mighty fine key-value store using
> just a Simple Replication Strategy.
>
> A different way to visualise it is a single ring with a Z axis for the DCs. When you
> look at the ring from the top you can see all the nodes. When you look at it from the side
> you can see the nodes are on levels that correspond to their DC. Simple Strategy looks at
> the ring from the top. NTS works through the layers of the ring.
>
>> If the hierarchy is Cluster ->
>> DataCenter -> Node, why exactly do we need globally unique node tokens
>> even though nodes are at the lowest level in the hierarchy.
> Nodes having a DC is a feature of *some* snitches and is utilised by *some* of the replication
> strategies (and by the messaging system for network efficiency). For background, the mapping from
> row tokens to nodes is based on http://en.wikipedia.org/wiki/Consistent_hashing
>
> Hope that helps.
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 24/05/2012, at 1:07 AM, java jalwa wrote:
>
>> Thanks Aaron. That makes things clear.
>> So I guess the 0 - 2^127 range for tokens corresponds to a cluster-level,
>> top-level ring, and then you add some logic on top of that with
>> NTS to logically segment that range into sub-rings as per the notion
>> of data centers defined in NTS. What's the advantage of having a
>> single top-level ring? Intuitively it seems like each replication
>> group could have a separate ring so that the same tokens can be
>> assigned to nodes in different DCs. If the hierarchy is Cluster ->
>> DataCenter -> Node, why exactly do we need globally unique node tokens
>> even though nodes are at the lowest level in the hierarchy?
>>
>> Thanks again.
>>
>>
>> On Wed, May 23, 2012 at 3:14 AM, aaron morton <aaron@thelastpickle.com> wrote:
>>>> Now if a row key hash is mapped to a range owned by a node in DC3,
>>>> will the Node in DC3 still store the key as determined by the
>>>> partitioner and then walk the ring and store 2 replicas each in DC1
>>>> and DC2 ?
>>> No, only nodes in the DC's specified in the NTS configuration will be replicas.
>>>
>>>> Or will the co-ordinator node be aware of the
>>>> replica placement strategy,
>>>> and override the partitioner's decision and walk the ring until it
>>>> first encounters a node in DC1 or DC2 ? and then place the remaining
>>>> replicas ?
>>> The NTS considers each DC to have its own ring. This can make token selection
>>> in a multi-DC environment confusing at times. There is something in the DS docs about it.
>>>
>>> Cheers
>>>
>>> -----------------
>>> Aaron Morton
>>> Freelance Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 23/05/2012, at 3:16 PM, java jalwa wrote:
>>>
>>>> Hi all,
>>>>              I am a bit confused regarding the terms "replica" and
>>>> "replication factor". Assume that I am using RandomPartitioner and
>>>> NetworkTopologyStrategy for replica placement.
>>>> From what I understand, with a RandomPartitioner, a row key will
>>>> always be hashed and be stored on the node that owns the range to
>>>> which the key is mapped.
>>>> http://www.datastax.com/docs/1.0/cluster_architecture/replication#networktopologystrategy.
>>>> The example here, talks about having 2 data centers and a replication
>>>> factor of 4 with 2 replicas in each datacenter, so the strategy is
>>>> configured as DC1:2 and DC2:2. Now suppose I add another datacenter
>>>> DC3, and do not change the NetworkTopologyStrategy.
>>>> Now if a row key hash is mapped to a range owned by a node in DC3,
>>>> will the Node in DC3 still store the key as determined by the
>>>> partitioner and then walk the ring and store 2 replicas each in DC1
>>>> and DC2 ? Will that mean that I will then have 5 replicas in the
>>>> cluster and not 4 ? Or will the co-ordinator node be aware of the
>>>> replica placement strategy,
>>>> and override the partitioner's decision and walk the ring until it
>>>> first encounters a node in DC1 or DC2 ? and then place the remaining
>>>> replicas ?
>>>>
>>>> Thanks.
>>>
>
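
The thread's point that all nodes share one 0 to 2^127 token space, while NTS treats each datacenter as its own ring, can be sketched as follows. This is a hedged illustration under stated assumptions: the `dc_tokens` helper is hypothetical, and offsetting each datacenter's tokens by a small integer is one convention for keeping tokens globally unique, not something confirmed by the messages above.

```python
# RandomPartitioner token space discussed in the thread: 0 to 2**127.
RING_SIZE = 2 ** 127


def dc_tokens(node_count, dc_index):
    """Evenly space node_count tokens around the ring, as if this
    datacenter were the only one, then shift them by dc_index so that
    no two nodes in the whole cluster share a token.

    Assumption: dc_index is a small non-negative integer, unique per
    datacenter (0 for the first DC, 1 for the second, and so on).
    """
    return [
        (i * RING_SIZE // node_count + dc_index) % RING_SIZE
        for i in range(node_count)
    ]


# Example: three datacenters with two nodes each.
all_tokens = [t for dc in range(3) for t in dc_tokens(2, dc)]
```

Each datacenter ends up with an evenly balanced set of tokens when viewed on its own, and the offsets keep every token distinct when the ring is viewed as a whole.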
