cassandra-user mailing list archives

From Vijay <vijay2...@gmail.com>
Subject Re: Docs: Token Selection
Date Thu, 16 Jun 2011 03:51:45 GMT
+1 for more documentation (I guess contributions are always welcome).... I
will try to write it down sometime when we have a bit more time...

The 0.8 nodetool ring command adds the DC and rack information....

http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers
http://www.datastax.com/products/opscenter

Hope this helps...

Regards,
</VJ>



On Wed, Jun 15, 2011 at 7:24 PM, AJ <aj@dude.podzone.net> wrote:

>  Ok.  I understand the reasoning you laid out.  But, I think it should be
> documented more thoroughly.  I was trying to get an idea as to how flexible
> Cass lets you be with the various combinations of strategies, snitches,
> token ranges, etc..
>
> It would be instructional to see what a graphical representation of a
> cluster ring with multiple data centers looks like.  Google turned up
> nothing.  I imagine it's a multilayer ring; one layer per data center with
> the nodes of one layer slightly offset from the ones in the other (based on
> the example in the wiki).  I would also like to know which node is next in
> the ring so as to understand replica placement in, for example, the
> OldNetworkTopologyStrategy when its doc states,
>
> "...It places one replica in a different data center from the first (if
> there is any such data center), the third replica in a different rack in the
> first datacenter, and any remaining replicas on the first unused nodes on
> the ring."
>
> I can only assume for now that "the ring" referred to is the "local" ring
> of the first data center.
>
>
>
> On 6/15/2011 5:51 PM, Vijay wrote:
>
> No, it won't.... it will assume you are doing the right thing...
>
> Regards,
> </VJ>
>
>
>
> On Wed, Jun 15, 2011 at 2:34 PM, AJ <aj@dude.podzone.net> wrote:
>
>>  Vijay, thank you for your thoughtful reply.  Will Cass complain if I
>> don't set up my tokens like in the examples?
>>
>>
>> On 6/15/2011 2:41 PM, Vijay wrote:
>>
>> All you heard is right...
>> You are not overriding Cassandra's token assignment by saying here is your
>> token...
>>
>>  Logic is:
>> Calculate a token for the given key...
>> Find the node in each region independently (if you use NTS and have set
>> the strategy options to say you want to replicate to the other
>> region)...
>> Search for the ranges in each region independently.
>> Replicate the data to that node.
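
The lookup steps above can be sketched roughly as follows. This is a simplified model, not Cassandra's actual code: it assumes a toy ring of size 16, one replica per data center, and a hypothetical CLUSTER layout that reuses tokens 0, 8, 4 and 12 from the examples in this thread. The names first_replica and replicas_per_dc are made up for illustration.

```python
from bisect import bisect_left

RING_SIZE = 16  # assumption: toy ring matching the small token examples in this thread

# Hypothetical layout: data center -> list of (token, node name), sorted by token.
CLUSTER = {
    "DC1": [(0, "dc1-node1"), (8, "dc1-node2")],
    "DC2": [(4, "dc2-node1"), (12, "dc2-node2")],
}


def first_replica(dc_nodes, key_token):
    """Return the node of ONE data center whose range covers key_token.

    A node with token t owns (previous_token, t], so the owner is the first
    node whose token is >= key_token, wrapping around to the lowest token.
    """
    tokens = [token for token, _ in dc_nodes]
    idx = bisect_left(tokens, key_token % RING_SIZE)
    return dc_nodes[idx % len(dc_nodes)][1]


def replicas_per_dc(key_token):
    """Search each data center's nodes independently for the same key token."""
    return {dc: first_replica(nodes, key_token) for dc, nodes in CLUSTER.items()}


if __name__ == "__main__":
    for token in (0, 5, 13):
        print(token, replicas_per_dc(token))
```

Each data center is searched as if the other data center's nodes were not on the ring, which is the point made in the steps above.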
>>
>> For multi-DC, Cassandra needs nodes to be equally partitioned within each
>> DC (if you care that the load is equally distributed).... Also,
>> there shouldn't be any collision of tokens within a cluster....
>>
>>  The documentation, and the example in it, tries to explain the same
>> thing.
>> Hope this clarifies...
>>
>>  More examples if it helps....
>>
>>   DC1 Node 1 : token 0
>> DC1 Node 2 : token 8..
>>
>>  DC2 Node 1 : token 4..
>> DC2 Node 2 : token 12..
>>
>>  or
>>
>>  DC1 Node 1 : token 0
>> DC1 Node 2 : token 1..
>>
>>  DC2 Node 1 : token 8..
>> DC2 Node 2 : token  7..
>>
>>  Regards,
>> </VJ>
>>
>>
>>
>> On Wed, Jun 15, 2011 at 12:28 PM, AJ <aj@dude.podzone.net> wrote:
>>
>>>  On 6/15/2011 12:14 PM, Vijay wrote:
>>>
>>> Correction....
>>>
>>>  "The problem in the above approach is you have 2 nodes between 12 to 4
>>> in DC1 but from 4 to 12  you just have 1"
>>>
>>>  should be
>>>
>>>  "The problem in the above approach is you have 1 node between 0-4 (25%)
>>> and one node covering the rest which is 4-16, 0-0 (75%)"
>>>
>>> Regards,
>>> </VJ>
>>>
>>>
>>>  Ok, I think you are saying that the computed token range intervals are
>>> incorrect and that they would be:
>>>
>>> DC1
>>> *node 1 = 0      Range: (4, 16], (0, 0]
>>>
>>> node 2 = 4      Range: (0, 4]
>>>
>>> DC2
>>>  *node 3 = 8      Range: (12, 16], (0, 8]
>>>
>>> node 4 = 12   Range: (8, 12]
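
The per-data-center ranges listed above, and the 25%/75% split from the correction, can be reproduced with a short sketch. It assumes the same toy ring of size 16 and that ranges are computed only from the tokens within one data center; ranges_for_dc is a hypothetical helper, not a Cassandra API.

```python
RING_SIZE = 16  # assumption: toy ring size used in the examples in this thread


def ranges_for_dc(tokens, ring_size=RING_SIZE):
    """Map each token to (ranges, owned_fraction) using one DC's tokens only.

    Each range is a (start, end) pair meaning the half-open interval (start, end].
    """
    ordered = sorted(tokens)
    result = {}
    for i, token in enumerate(ordered):
        prev = ordered[i - 1]  # for i == 0 this wraps to the highest token
        if prev < token:
            ranges = [(prev, token)]
            width = token - prev
        else:
            # The range wraps past the top of the ring, so split it in two.
            ranges = [(prev, ring_size), (0, token)]
            width = ring_size - prev + token
        result[token] = (ranges, width / ring_size)
    return result


if __name__ == "__main__":
    for dc, tokens in {"DC1": [0, 4], "DC2": [8, 12]}.items():
        for token, (ranges, share) in ranges_for_dc(tokens).items():
            print(dc, token, ranges, f"{share:.0%}")
```

With these tokens each data center ends up with one node owning a quarter of the ring and the other owning the remaining three quarters, which is the imbalance described in the correction.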
>>>
>>>  If so, then yes, this is what I am seeking to confirm since I haven't
>>> found any documentation stating this directly and that reference that I gave
>>> only implies this; that is, that the token ranges are calculated per data
>>> center rather than per cluster.  I just need someone to confirm that 100%
>>> because it doesn't sound right to me based on everything else I've read.
>>>
>>> SO, the question is:  Does Cass calculate the consecutive node token
>>> ranges A.) per cluster, or B.) for the whole data center?
>>>
>>> From all I understand, the answer is B.  But, that documentation
>>> (reprinted below) implies A... or something that doesn't make sense to me
>>> because of the token placement in the example:
>>>
>>> "With NetworkTopologyStrategy, you should calculate the tokens the nodes
>>> in each DC independently...
>>>
>>> DC1
>>> node 1 = 0
>>> node 2 = 85070591730234615865843651857942052864
>>>
>>> DC2
>>> node 3 = 1
>>> node 4 = 85070591730234615865843651857942052865"
>>>
>>>
>>> However, I do see why this would be helpful, but first I'm just asking if
>>> this token assignment is absolutely mandatory or if it's just a technique
>>> to achieve some end.
>>>
>>>
>>>
>>>
>>
>>
>
>
