cassandra-user mailing list archives

From Enrico Cavallin <cavallin.enr...@gmail.com>
Subject Re: Uneven token distribution with allocate_tokens_for_keyspace
Date Mon, 02 Dec 2019 11:03:31 GMT
Hi Anthony,
thank you for your hints; the new DC is now well balanced, within 2%.
I did read your article, but I thought it was needed only for new
"clusters", not also for new "DCs"; but RF is configured per DC, so it
makes sense.

You TLP guys are doing a great job for the Cassandra community.

Thank you,
Enrico


On Fri, 29 Nov 2019 at 05:09, Anthony Grasso <anthony.grasso@gmail.com>
wrote:

> Hi Enrico,
>
> This is a classic chicken and egg problem with the
> allocate_tokens_for_keyspace setting.
>
> The allocate_tokens_for_keyspace setting uses the replication factor
> that a keyspace defines for the node's DC to calculate the token
> allocation when the node is added to the cluster for the first time.
>
> Nodes need to be added to the new DC before we can replicate the keyspace
> over to it. Herein lies the problem. We are unable to use
> allocate_tokens_for_keyspace unless the keyspace is replicated to the new
> DC. In addition, as soon as you change the keyspace replication to the new
> DC, new data will start to be written to it. To work around this issue you
> will need to do the following.
>
>    1. Decommission all the nodes in *dcNew*, one at a time.
>    2. Once all the *dcNew* nodes are decommissioned, wipe the contents in
>    the *commitlog*, *data*, *saved_caches*, and *hints* directories of
>    these nodes.
>    3. Make the first node to add into the *dcNew* a seed node. Set the
>    seed list of the first node with its IP address and the IP addresses of the
>    other seed nodes in the cluster.
>    4. Set the *initial_token* setting for the first node. You can
>    calculate the values using the algorithm in my blog post:
>    https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html.
>    For convenience I have calculated them:
>    *-9223372036854775808,-4611686018427387904,0,4611686018427387904*.
>    Note, remove the *allocate_tokens_for_keyspace* setting from the
>    *cassandra.yaml* file for this (seed) node.
>    5. Check to make sure that no other node in the cluster is assigned
>    any of the four tokens specified above. If another node in the cluster
>    is assigned one of the above tokens, increment the conflicting token by
>    one until no other node in the cluster is assigned that token value. The
>    idea is to make sure that these four tokens are unique to the node.
>    6. Add the seed node to the cluster. Make sure it is listed in
>    *dcNew* by checking nodetool status.
>    7. Create a dummy keyspace in *dcNew* that has a replication factor of
>    2.
>    8. Set the *allocate_tokens_for_keyspace* value to be the name of the
>    dummy keyspace for the other two nodes you want to add to *dcNew*.
>    Note, remove the *initial_token* setting for these other nodes.
>    9. Set *auto_bootstrap* to *false* for the other two nodes you want to
>    add to *dcNew*.
>    10. Add the other two nodes to the cluster, one at a time.
>    11. If you are happy with the distribution, copy the data to *dcNew*
>    by running a rebuild.
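
As a quick sanity check, the initial_token values quoted in step 4 can be
reproduced by spacing num_tokens tokens evenly across the full Murmur3
partitioner range, per the blog post linked above (a minimal Python sketch;
the function name is illustrative, not part of Cassandra):

```python
def evenly_spaced_tokens(num_tokens):
    # Spread num_tokens initial tokens evenly across the Murmur3
    # partitioner range [-2**63, 2**63 - 1].
    step = 2**64 // num_tokens
    return [-2**63 + i * step for i in range(num_tokens)]

# The four tokens quoted in step 4, comma-separated for cassandra.yaml:
print(",".join(str(t) for t in evenly_spaced_tokens(4)))
# -9223372036854775808,-4611686018427387904,0,4611686018427387904
```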
>
>
> Hope this helps.
>
> Regards,
> Anthony
>
> On Fri, 29 Nov 2019 at 02:08, Enrico Cavallin <cavallin.enrico@gmail.com>
> wrote:
>
>> Hi all,
>> I have an old datacenter with 4 nodes and 256 tokens each.
>> I am now starting a new datacenter with 3 nodes and num_token=4
>> and allocate_tokens_for_keyspace=myBiggestKeyspace in each node.
>> Both DCs run Cassandra 3.11.x.
>>
>> myBiggestKeyspace has RF=3 in dcOld and RF=2 in dcNew. Now dcNew is very
>> unbalanced.
>> Also keyspaces with RF=2 in both DCs have the same problem.
>> Did I miss something, or does allocate_tokens_for_keyspace still have
>> strong limitations with a low num_tokens?
>> Any suggestions on how to mitigate it?
>>
>> # nodetool status myBiggestKeyspace
>> Datacenter: dcOld
>> =================
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  Address  Load        Tokens  Owns (effective)  Host ID                               Rack
>> UN  x.x.x.x  515.83 GiB  256     76.2%             fc462eb2-752f-4d26-aae3-84cb9c977b8a  rack1
>> UN  x.x.x.x  504.09 GiB  256     72.7%             d7af8685-ba95-4854-a220-bc52dc242e9c  rack1
>> UN  x.x.x.x  507.50 GiB  256     74.6%             b3a4d3d1-e87d-468b-a7d9-3c104e219536  rack1
>> UN  x.x.x.x  490.81 GiB  256     76.5%             41e80c5b-e4e3-46f6-a16f-c784c0132dbc  rack1
>>
>> Datacenter: dcNew
>> =================
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  Address  Load        Tokens  Owns (effective)  Host ID                               Rack
>> UN  x.x.x.x  145.47 KiB  4       56.3%             7d089351-077f-4c36-a2f5-007682f9c215  rack1
>> UN  x.x.x.x  122.51 KiB  4       55.5%             625dafcb-0822-4c8b-8551-5350c528907a  rack1
>> UN  x.x.x.x  127.53 KiB  4       88.2%             c64c0ce4-2f85-4323-b0ba-71d70b8e6fbf  rack1
>>
>> Thanks,
>> -- ec
>>
>
