cassandra-user mailing list archives

From Jeff Jirsa <jji...@gmail.com>
Subject Re: does c* 3.0 use one ring for all datacenters?
Date Sat, 28 Apr 2018 15:40:49 GMT
If you tried this, it’d probably fail in an unpleasant way.

Tokens never move automatically. We add new tokens. Administrators can move tokens. Cassandra
doesn’t auto-move tokens.
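
As a rough illustration only (a toy model, not Cassandra's actual startup check), the sketch below takes the example from the quoted question, where two DCs each picked tokens 0, 20, 40, 60, 80 on their own, and shows that every token would collide if the two were joined into one ring. The function name find_duplicate_tokens is hypothetical.

```python
def find_duplicate_tokens(rings):
    """Return {token: [dc names]} for every token claimed by more than one DC."""
    owners = {}
    for dc, tokens in rings.items():
        for token in tokens:
            owners.setdefault(token, []).append(dc)
    return {token: dcs for token, dcs in owners.items() if len(dcs) > 1}


# Tokens from the quoted question: both DCs chose the same five tokens.
rings = {
    "DC1": [0, 20, 40, 60, 80],
    "DC2": [0, 20, 40, 60, 80],
}

duplicates = find_duplicate_tokens(rings)
print(sorted(duplicates))
```

Since a token can have only one owner in a single ring, a plan like this has to be corrected by an administrator before the DCs are joined; nothing merges or moves the tokens automatically.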

-- 
Jeff Jirsa


> On Apr 28, 2018, at 3:05 AM, Jinhua Luo <luajit.io@gmail.com> wrote:
> 
> Suppose two DCs are set up separately before they ever meet each other.
> Given that the total token range is 100 and each DC has the same number of
> tokens, let's say 5,
> then because they assign tokens independently, they will have the same
> token ranges, right?
> For example,
> DC1 = {0, 20, 40, 60, 80}
> DC2 = {0, 20, 40, 60, 80}
> 
> Then, when the DCs meet each other, they should merge the two rings into one, right?
> Here are the questions:
> a) Who does the merge?
> b) Do the tokens change after the merge?
> 
> 
> 
> 2018-04-27 1:51 GMT+08:00 Jeff Jirsa <jjirsa@gmail.com>:
>> 
>> 
>>> On Thu, Apr 26, 2018 at 1:34 AM, Jinhua Luo <luajit.io@gmail.com> wrote:
>>> 
>>> How do you guarantee that the tokens are independent between DCs?
>> 
>> 
>> Cassandra won't let you have duplicate tokens - it won't start if you do it by
>> mistake, and it won't do it automatically.
>> 
>>> 
>>> They form one ring, and they must be (re-)assigned when needed.
>> 
>> 
>> Tokens don't move automatically. There's no auto-reassignment. You can move a
>> token, but nothing does it automatically.
>> 
>>> 
>>> Use an offset per DC? But it seems that the DC list must then be fixed in
>>> advance?
>>> To make sure the tokens are evenly distributed around the ring among the
>>> DC(s), is there any chance that the tokens owned by each DC change?
>>> Could you please give a detailed token re-balancing procedure for the case
>>> of node add/remove?
>> 
>> 
>> Calculate final state. Run repair and cleanup. Move tokens as needed. If
>> you're not able to reason through this, you may want to consider using
>> vnodes so it becomes less of an issue.
>> 
>>> 
>>> 
>>> 2018-04-26 16:23 GMT+08:00 Xiaolong Jiang <xiaolong302@gmail.com>:
>>>> DCs are independent of each other. Adding nodes to DC1 won't have any
>>>> effect on the tokens owned by other DCs.
>>>> 
>>>>> On Thu, Apr 26, 2018 at 1:04 AM, Jinhua Luo <luajit.io@gmail.com> wrote:
>>>>> 
>>>>> You're assuming each DC has the same total num_tokens, right?
>>>>> If I add a new node into DC1, will it change the tokens owned by DC2 and
>>>>> DC3?
>>>>> 
>>>>> 2018-04-12 0:59 GMT+08:00 Jeff Jirsa <jjirsa@gmail.com>:
>>>>>> When you add DC3, they'll get tokens (that aren't currently in use in any
>>>>>> existing DC). You can either assign tokens yourself (let's pretend we
>>>>>> manually assigned the other ones, since DC2 = DC1 + 1), or Cassandra can
>>>>>> auto-calculate them; the exact behavior varies by version.
>>>>>> 
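
The "DC2 = DC1 + 1" convention above can be sketched as evenly spaced tokens with a small per-DC offset. This is only a toy calculation under stated assumptions: the function name evenly_spaced_tokens is hypothetical, and the ring size of 90 is chosen purely so that the output matches the example tokens in this thread (a real partitioner uses a far larger token range).

```python
def evenly_spaced_tokens(node_count, ring_size, offset=0):
    """Spread node_count tokens evenly over the ring, shifted by a per-DC offset."""
    step = ring_size // node_count
    return [(i * step + offset) % ring_size for i in range(node_count)]


# Assumed toy ring of size 90 with 9 nodes per DC, as in the thread's example.
dc1 = evenly_spaced_tokens(9, 90, offset=0)
dc2 = evenly_spaced_tokens(9, 90, offset=1)

print(dc1)
print(dc2)
```

Because each DC uses a different offset, the two token lists never overlap, which is what allows both DCs to sit on the same ring.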
>>>>>> 
>>>>>> Let's pretend it's old style random assignment, and we end up with DC3
>>>>>> having 4, 17, 22, 36, 48, 53, 64, 73, 83
>>>>>> 
>>>>>> In this case:
>>>>>> 
>>>>>> If you use SimpleStrategy and RF=3, a key with token 5 would be placed on
>>>>>> the hosts with tokens 10, 11, 17.
>>>>>> If you use NetworkTopologyStrategy with RF=3 per DC, a key with token 5
>>>>>> would be placed on the hosts with tokens 10, 20, 30; 11, 21, 31; 17, 22, 36.
>>>>>> 
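
The placement described above can be reproduced with a small simulation. This is a simplified sketch, not Cassandra's implementation: it assumes one token per host, ignores racks, and uses hypothetical names (walk_ring, simple_strategy, network_topology_strategy). Both strategies walk the single shared ring clockwise from the key's token; SimpleStrategy takes the next RF hosts regardless of DC, while the NetworkTopologyStrategy model keeps walking until each DC has its own RF hosts.

```python
from bisect import bisect_left
from itertools import islice

# Tokens from the thread; each token stands for one host (no vnodes, no racks).
DC_TOKENS = {
    "DC1": [0, 10, 20, 30, 40, 50, 60, 70, 80],
    "DC2": [1, 11, 21, 31, 41, 51, 61, 71, 81],
    "DC3": [4, 17, 22, 36, 48, 53, 64, 73, 83],
}
TOKEN_TO_DC = {t: dc for dc, tokens in DC_TOKENS.items() for t in tokens}
RING = sorted(TOKEN_TO_DC)  # one ring containing the tokens of every DC


def walk_ring(key_token):
    """Yield every token once, clockwise, starting at the first token >= key_token."""
    start = bisect_left(RING, key_token) % len(RING)
    for i in range(len(RING)):
        yield RING[(start + i) % len(RING)]


def simple_strategy(key_token, rf):
    """Take the next rf hosts on the ring, ignoring which DC they are in."""
    return list(islice(walk_ring(key_token), rf))


def network_topology_strategy(key_token, rf_per_dc):
    """Walk the same ring, but collect rf hosts separately for each DC."""
    replicas = {dc: [] for dc in rf_per_dc}
    for token in walk_ring(key_token):
        dc = TOKEN_TO_DC[token]
        if dc in replicas and len(replicas[dc]) < rf_per_dc[dc]:
            replicas[dc].append(token)
    return replicas


simple = simple_strategy(5, rf=3)
nts = network_topology_strategy(5, {"DC1": 3, "DC2": 3, "DC3": 3})
print(simple)
print(nts)
```

In this model, adding DC3 changes the SimpleStrategy result (the third replica becomes the host with token 17), while the DC1 and DC2 replicas under the per-DC strategy stay the same.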
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Wed, Apr 11, 2018 at 9:36 AM, Jinhua Luo <luajit.io@gmail.com>
>>>>>> wrote:
>>>>>>> 
>>>>>>> What if I add a new DC3?
>>>>>>> Would the token ranges be reshuffled among DC1, DC2, and DC3?
>>>>>>> 
>>>>>>> 2018-04-11 22:06 GMT+08:00 Jeff Jirsa <jjirsa@gmail.com>:
>>>>>>>> Confirming again that it's definitely one ring.
>>>>>>>> 
>>>>>>>> DC1 may have tokens 0, 10, 20, 30, 40, 50, 60, 70, 80
>>>>>>>> DC2 may have tokens 1, 11, 21, 31, 41, 51, 61, 71, 81
>>>>>>>> 
>>>>>>>> If you use SimpleStrategy and RF=3, a key with token 5 would be placed on
>>>>>>>> the hosts with tokens 10, 11, 20.
>>>>>>>> If you use NetworkTopologyStrategy with RF=3 per DC, a key with token 5
>>>>>>>> would be placed on the hosts with tokens 10, 20, 30 and 11, 21, 31.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Wed, Apr 11, 2018 at 6:27 AM, Jinhua Luo <luajit.io@gmail.com>
>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>> Is it a different answer? One ring?
>>>>>>>>> 
>>>>>>>>> Could you explain your answer according to my example?
>>>>>>>>> 
>>>>>>>>> 2018-04-11 21:24 GMT+08:00 Jonathan Haddad <jon@jonhaddad.com>:
>>>>>>>>>> There has always been a single ring.
>>>>>>>>>> 
>>>>>>>>>> You can specify how many nodes in each DC you want and it’ll figure out how
>>>>>>>>>> to do it as long as you have the right snitch and are using
>>>>>>>>>> NetworkTopologyStrategy.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Wed, Apr 11, 2018 at 6:11 AM Jinhua Luo <luajit.io@gmail.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Let me clarify my question:
>>>>>>>>>>> 
>>>>>>>>>>> Given we have a cluster of two DCs, each DC has 2 nodes, and each node
>>>>>>>>>>> sets num_tokens to 50.
>>>>>>>>>>> Then how are token ranges distributed in the cluster?
>>>>>>>>>>> 
>>>>>>>>>>> If there is one global ring, then it may be (to simplify the case, let's
>>>>>>>>>>> assume vnodes=1):
>>>>>>>>>>> {dc1, node1} 1-50
>>>>>>>>>>> {dc2, node1} 51-100
>>>>>>>>>>> {dc1, node1} 101-150
>>>>>>>>>>> {dc1, node2} 151-200
>>>>>>>>>>> 
>>>>>>>>>>> But here come more questions:
>>>>>>>>>>> a) What if I add a new datacenter? Do the token ranges then need to be
>>>>>>>>>>> re-balanced?
>>>>>>>>>>> If so, what about the data associated with the ranges to be balanced?
>>>>>>>>>>> Is it moved among DCs?
>>>>>>>>>>> But that doesn't make sense, because each keyspace would specify its
>>>>>>>>>>> snitch and fix the DCs to store them in.
>>>>>>>>>>> 
>>>>>>>>>>> b) There seem to be no benefits from sharing one ring, because of the
>>>>>>>>>>> snitch.
>>>>>>>>>>> 
>>>>>>>>>>> If each DC has own ring, then it may be:
>>>>>>>>>>> {dc1, node1} 1-50
>>>>>>>>>>> {dc1, node1} 51-100
>>>>>>>>>>> {dc2, node1} 1-50
>>>>>>>>>>> {dc2, node1} 51-100
>>>>>>>>>>> 
>>>>>>>>>>> I think this is not a trivial question, because each key would be
>>>>>>>>>>> hashed to determine the token it belongs to, and the token range
>>>>>>>>>>> distribution in turn determines which node the key belongs to.
>>>>>>>>>>> 
>>>>>>>>>>> Any official answer?
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 2018-04-11 20:54 GMT+08:00 Jacques-Henri Berthemet
>>>>>>>>>>> <jacques-henri.berthemet@genesys.com>:
>>>>>>>>>>>> Maybe I misunderstood something, but from what I understand, each DC has
>>>>>>>>>>>> the same ring (0-100 in your example) but it's split differently between
>>>>>>>>>>>> nodes in each DC. I think it's the same principle whether using vnodes or
>>>>>>>>>>>> not.
>>>>>>>>>>>> 
>>>>>>>>>>>> I think the confusion comes from the fact that the ring range is the
>>>>>>>>>>>> same (0-100) but each DC manages it differently because nodes are
>>>>>>>>>>>> different.
>>>>>>>>>>>> 
>>>>>>>>>>>> --
>>>>>>>>>>>> Jacques-Henri Berthemet
>>>>>>>>>>>> 
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> From: Jinhua Luo [mailto:luajit.io@gmail.com]
>>>>>>>>>>>> Sent: Wednesday, April 11, 2018 2:26 PM
>>>>>>>>>>>> To: user@cassandra.apache.org
>>>>>>>>>>>> Subject: Re: does c* 3.0 use one ring for all datacenters?
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks for your reply. I also think separate rings are more reasonable.
>>>>>>>>>>>> 
>>>>>>>>>>>> So one ring for one dc is only for c* 1.x or 2.x without vnodes?
>>>>>>>>>>>> 
>>>>>>>>>>>> Check these references:
>>>>>>>>>>>> 
>>>>>>>>>>>> https://docs.datastax.com/en/archived/cassandra/1.1/docs/initialize/token_generation.html
>>>>>>>>>>>> http://www.luketillman.com/one-token-ring-to-rule-them-all/
>>>>>>>>>>>> https://community.apigee.com/articles/13096/cassandra-token-distribution.html
>>>>>>>>>>>> 
>>>>>>>>>>>> Even the official Riak post says c* splits the ring across DCs:
>>>>>>>>>>>> 
>>>>>>>>>>>> http://basho.com/posts/business/riak-vs-cassandra-an-updated-brief-comparison/
>>>>>>>>>>>> 
>>>>>>>>>>>> Why did they say each DC has its own ring?
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 2018-04-11 19:55 GMT+08:00 Jacques-Henri Berthemet
>>>>>>>>>>>> <jacques-henri.berthemet@genesys.com>:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Each DC has the whole ring; each DC contains a copy of the same data.
>>>>>>>>>>>>> When you add replication to a new DC, all data is copied to the new DC.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Within a DC, each range of tokens is 'owned' by a (primary) node (and
>>>>>>>>>>>>> replicas if you have RF > 1). If you add/remove a node in a DC, tokens will
>>>>>>>>>>>>> be rearranged between all nodes within the DC only; the other DCs won't be
>>>>>>>>>>>>> affected.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Jacques-Henri Berthemet
>>>>>>>>>>>>> 
>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>> From: Jinhua Luo [mailto:luajit.io@gmail.com]
>>>>>>>>>>>>> Sent: Wednesday, April 11, 2018 12:35 PM
>>>>>>>>>>>>> To: user@cassandra.apache.org
>>>>>>>>>>>>> Subject: does c* 3.0 use one ring for all datacenters?
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I know it seems a stupid question, but I am really confused about the
>>>>>>>>>>>>> documents on the internet related to this topic, especially since it seems
>>>>>>>>>>>>> that there are different answers for c* with and without vnodes.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Let's assume the token range is 1-100 for the whole cluster; how is it
>>>>>>>>>>>>> distributed among the datacenters? Given that the number of datacenters
>>>>>>>>>>>>> in a cluster is dynamic, if there is only one ring, then would the token
>>>>>>>>>>>>> range on each node change when I add a new datacenter to the cluster?
>>>>>>>>>>>>> Would that then involve data migration? It doesn't make sense.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Looking forward to clarification for c* 3.0, thanks!
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>>>>> To unsubscribe, e-mail:
>>>>>>>>>>>>> user-unsubscribe@cassandra.apache.org
>>>>>>>>>>>>> For additional commands, e-mail:
>>>>>>>>>>>>> user-help@cassandra.apache.org
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Best regards,
>>>> Xiaolong Jiang
>>>> 
>>>> Software Engineer at Apple
>>>> Columbia University
>>> 
>>> 
>> 
> 
> 


