cassandra-user mailing list archives

From Eric Stevens <migh...@gmail.com>
Subject Re: Best Practice to add a node in a Cluster
Date Tue, 28 Apr 2015 15:24:57 GMT
I would double check in a test cluster (or use a tool like CCM to set up
a local throwaway cluster) to confirm, but for this *specific* use case
(going from RF==NodeCount to RF==NodeCount with a higher number) you should
be able to take a simpler path.  Set RF=3 before you add your new node,
then add the new node.  It will bootstrap all data from the other two
nodes, and then your job is done.

You shouldn't have to run repair (which you normally have to do after
increasing RF in order to make sure all nodes have their data; here the
nodes already have all their data), and you shouldn't have to run cleanup
(which you normally have to do after increasing node count to instruct the
old nodes to forget data for which they are no longer responsible).  The
data responsibility hasn't changed for any node; all nodes are still
responsible for all data.
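
To make that reasoning concrete, here is a rough Python sketch of the
arithmetic. The function names are my own and are not part of any Cassandra
API, and it assumes a single datacenter with an evenly balanced ring. It
shows what fraction of the data set each node holds for a given RF and node
count, and how many replicas can be down while QUORUM is still reachable
(the quorum point raised earlier in this thread).

```python
from fractions import Fraction


def ownership_fraction(rf: int, node_count: int) -> Fraction:
    """Fraction of the full data set each node stores.

    Assumption: one datacenter and evenly distributed tokens, so each
    node holds roughly RF / NodeCount of the data (capped at 100%).
    """
    if rf < 1 or node_count < 1:
        raise ValueError("rf and node_count must be >= 1")
    effective_rf = min(rf, node_count)  # cannot place more replicas than nodes
    return Fraction(effective_rf, node_count)


def quorum(rf: int) -> int:
    """Number of replicas that must respond for QUORUM: floor(RF / 2) + 1."""
    return rf // 2 + 1


def tolerable_down(rf: int) -> int:
    """Replicas that can be offline while QUORUM can still be met."""
    return rf - quorum(rf)


if __name__ == "__main__":
    for rf, nodes in [(2, 2), (3, 3), (2, 3)]:
        print(
            f"RF={rf} nodes={nodes}: "
            f"each node holds {ownership_fraction(rf, nodes)} of the data, "
            f"quorum={quorum(rf)}, can lose {tolerable_down(rf)} replica(s)"
        )
```

With RF=2 on 2 nodes and RF=3 on 3 nodes, every node holds the full data
set, which is why neither repair nor cleanup should be needed in this case.
If you instead kept RF=2 on 3 nodes, each node would hold about two thirds
of the data, and the old nodes would need cleanup.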

On Mon, Apr 27, 2015 at 9:19 PM, Neha Trivedi <nehajtrivedi@gmail.com>
wrote:

> Thanks Arun!
>
> On Tue, Apr 28, 2015 at 9:44 AM, arun sirimalla <arunsirik@gmail.com>
> wrote:
>
>> Hi Neha,
>>
>>
>> After you add the node to the cluster, run nodetool cleanup on all nodes.
>> Next, run repair on each node to replicate the data. Make sure you
>> run the repair on one node at a time, because repair is an expensive
>> process (it uses a lot of CPU).
>>
>>
>>
>>
>> On Mon, Apr 27, 2015 at 8:36 PM, Neha Trivedi <nehajtrivedi@gmail.com>
>> wrote:
>>
>>> Thanks Eric and Matt :) !!
>>>
>>> Yes the purpose is to improve reliability.
>>> Right now, from our driver we are querying using degradePolicy for
>>> reliability.
>>>
>>>
>>>
>>> *To change the keyspace to RF=3, the procedure is as follows:*
>>> 1. Add a new node to the cluster (new node is not in seed list)
>>>
>>> 2. ALTER KEYSPACE system_auth WITH REPLICATION =
>>>   {'class' : 'NetworkTopologyStrategy', 'dc1' : 3};
>>>
>>> 3. On each affected node, run nodetool repair
>>>    <http://docs.datastax.com/en/cassandra/1.2/cassandra/tools/toolsNodetool_r.html>.
>>>
>>> 4. Wait until repair completes on a node, then move to the next node.
>>>
>>>
>>> Is there anything else to take care of?
>>>
>>> Thanks
>>> Regards
>>> neha
>>>
>>>
>>> On Mon, Apr 27, 2015 at 9:45 PM, Eric Stevens <mightye@gmail.com> wrote:
>>>
>>>> It depends on why you're adding a new node.  If you're running out of
>>>> disk space or IO capacity in your 2 node cluster, then changing RF to 3
>>>> will not improve either condition - you'd still be writing all data to all
>>>> three nodes.
>>>>
>>>> However if you're looking to improve reliability, a 2 node RF=2 cluster
>>>> cannot have either node offline without losing quorum, while a 3 node RF=3
>>>> cluster can have one node offline and still be able to achieve quorum.
>>>> RF=3 is a common replication factor because of this characteristic.
>>>>
>>>> Make sure your new node is not in its own seeds list, or it will not
>>>> bootstrap (it will come online immediately and start serving requests).
>>>>
>>>> On Mon, Apr 27, 2015 at 8:46 AM, Neha Trivedi <nehajtrivedi@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi
>>>>> We have a 2-node cluster with RF=2. We are planning to add a new node.
>>>>>
>>>>> Should we change RF to 3 in the schema?
>>>>> Or just add a new node with the same RF=2?
>>>>>
>>>>> Are there any other best practices that we need to take care of?
>>>>>
>>>>> Thanks
>>>>> regards
>>>>> Neha
>>>>>
>>>>>
>>>>
>>>
>>
>>
>> --
>> Arun
>> Senior Hadoop/Cassandra Engineer
>> Cloudwick
>>
>> Champion of Big Data (Cloudera)
>>
>> http://www.cloudera.com/content/dev-center/en/home/champions-of-big-data.html
>>
>> 2014 Data Impact Award Winner (Cloudera)
>>
>> http://www.cloudera.com/content/cloudera/en/campaign/data-impact-awards.html
>>
>>
>
