incubator-cassandra-user mailing list archives

From Jabbar Azam <aja...@gmail.com>
Subject Re: Recovering from a faulty cassandra node
Date Thu, 21 Mar 2013 21:59:00 GMT
The nodetool cleanup command removes keys that no longer belong to the node
the command is run on. So I'm assuming I can run nodetool cleanup on all the
old nodes in parallel. I wouldn't do this on a live cluster, as it's I/O
intensive on each node.
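
For a live cluster, something like the rough sketch below could run the cleanup one node at a time instead of in parallel. It only assumes that nodetool is on the PATH and that each node accepts `nodetool -h <host> cleanup`; the function names and IP addresses are made up for illustration, and it defaults to a dry run that just lists the commands.

```python
import subprocess
from typing import List, Sequence


def build_cleanup_commands(hosts: Sequence[str]) -> List[List[str]]:
    """Return one `nodetool -h <host> cleanup` argument list per host."""
    return [["nodetool", "-h", host, "cleanup"] for host in hosts]


def run_cleanup(hosts: Sequence[str], dry_run: bool = True) -> List[str]:
    """Issue cleanup on each host in turn and return the command lines used.

    With dry_run=True (the default) nothing is executed.
    """
    issued = []
    for argv in build_cleanup_commands(hosts):
        issued.append(" ".join(argv))
        if not dry_run:
            # Blocks until this node finishes, so only one node at a time
            # carries the extra I/O load. Raises if nodetool exits non-zero.
            subprocess.run(argv, check=True)
    return issued


if __name__ == "__main__":
    # Hypothetical addresses for the three old nodes.
    for line in run_cleanup(["10.0.0.1", "10.0.0.2", "10.0.0.3"]):
        print(line)
```

On a test cluster the same commands could simply be started on every node at once.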


On 21 March 2013 17:26, Jabbar Azam <ajazam@gmail.com> wrote:

> Can I do a multiple node nodetool cleanup on my test cluster?
> On 21 Mar 2013 17:12, "Jabbar Azam" <ajazam@gmail.com> wrote:
>
>>
>> All cassandra-topology.properties are the same.
>>
>> The node add appears to be successful. I can see it using nodetool
>> status. I'm doing a node cleanup on the old nodes and then will do a node
>> remove, to remove the old node. The actual node join took about 6 hours.
>> The wiped node (now the new node) has about 324 GB of files in /var/lib/cassandra
>>
>>
>>
>>
>>
>> On 21 March 2013 16:58, aaron morton <aaron@thelastpickle.com> wrote:
>>
>>>  Not sure if I needed to change cassandra-topology.properties file on
>>> the existing nodes.
>>>
>>> If you are using the PropertyFileSnitch all nodes need to have the same
>>> cassandra-topology.properties file.
>>>
>>> Cheers
>>>
>>>    -----------------
>>> Aaron Morton
>>> Freelance Cassandra Consultant
>>> New Zealand
>>>
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 21/03/2013, at 1:34 AM, Jabbar Azam <ajazam@gmail.com> wrote:
>>>
>>> I've added the node with a different IP address and after disabling the
>>> firewall data is being streamed from the existing nodes to the wiped node.
>>> I'll do a cleanup, followed by remove node once it's done.
>>>
>>> I've also added the new node to the existing nodes'
>>> cassandra-topology.properties file and restarted them. I also found I had
>>> iptables switched on and couldn't understand why the wiped node couldn't
>>> see the cluster. Not sure if I needed to change
>>> cassandra-topology.properties file on the existing nodes.
>>>
>>>
>>>
>>>
>>> On 19 March 2013 15:49, Jabbar Azam <ajazam@gmail.com> wrote:
>>>
>>>> Do I use removenode before adding the reinstalled node or after?
>>>>
>>>>
>>>> On 19 March 2013 15:45, Alain RODRIGUEZ <arodrime@gmail.com> wrote:
>>>>
>>>>> In 1.2, you may want to use nodetool removenode if your server is
>>>>> broken or unreachable; otherwise I guess nodetool decommission remains
>>>>> the right way to remove a node.
>>>>> (http://www.datastax.com/docs/1.2/references/nodetool)
>>>>>
>>>>> When this node is out, rm -rf /yourpath/cassandra/* on this server,
>>>>> change the configuration if needed (not sure about the auto_bootstrap
>>>>> param) and start Cassandra on that node again. It should join the ring
>>>>> as a new node.
>>>>>
>>>>> Good luck.
>>>>>
>>>>>
>>>>> 2013/3/19 Hiller, Dean <Dean.Hiller@nrel.gov>
>>>>>
>>>>>> Since you "cleared" out that node, it IS the replacement node.
>>>>>>
>>>>>> Dean
>>>>>>
>>>>>> From: Jabbar Azam <ajazam@gmail.com>
>>>>>> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>>>>>> Date: Tuesday, March 19, 2013 9:29 AM
>>>>>> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>>>>>> Subject: Re: Recovering from a faulty cassandra node
>>>>>>
>>>>>> Hello Dean.
>>>>>>
>>>>>> I'm using vnodes so can't specify a token. In addition I can't follow
>>>>>> the replace node docs because I don't have a replacement node.
>>>>>>
>>>>>>
>>>>>> On 19 March 2013 15:25, Hiller, Dean <Dean.Hiller@nrel.gov> wrote:
>>>>>> I have not done this as of yet, but from all that I have read your
>>>>>> best option is to follow the replace node documentation, where I
>>>>>> believe you need to
>>>>>>
>>>>>>
>>>>>>  1.  Have the token be the same BUT add 1 to it so it doesn't think
>>>>>> it's the same computer
>>>>>>  2.  Have the bootstrap option set or something so streaming takes
>>>>>> effect.
>>>>>>
>>>>>> I would however test all of that out in QA to make sure it works, and
>>>>>> if you have QUORUM reads/writes a good part of that test would be to
>>>>>> take node X down after your node Y is back in the cluster to make sure
>>>>>> reads/writes are working on the node you fixed. You just need to make
>>>>>> sure node X shares one of the token ranges of node Y AND your
>>>>>> writes/reads are in that token range.
>>>>>>
>>>>>> Dean
>>>>>>
>>>>>> From: Jabbar Azam <ajazam@gmail.com>
>>>>>> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>>>>>> Date: Tuesday, March 19, 2013 8:51 AM
>>>>>> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>>>>>> Subject: Recovering from a faulty cassandra node
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I am using Cassandra 1.2.2 on a 4 node test cluster with vnodes. I
>>>>>> waited for over a week to insert lots of data into the cluster. During
>>>>>> the end of the process one of the nodes had a hardware fault.
>>>>>>
>>>>>> I have fixed the hardware fault but the filing system on that node is
>>>>>> corrupt so I'll have to reinstall the OS and cassandra.
>>>>>>
>>>>>> I can think of two ways of reintegrating the host into the cluster
>>>>>>
>>>>>> 1) shrink the cluster to three nodes and add the node into the cluster
>>>>>>
>>>>>> 2) Add the node into the cluster without shrinking
>>>>>>
>>>>>> I'm not sure of the best approach to take and I'm not sure how to
>>>>>> achieve each step.
>>>>>>
>>>>>> Can anybody help?
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Thanks
>>>>>>
>>>>>>  Jabbar Azam
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Thanks
>>>>>>
>>>>>> Jabbar Azam
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Thanks
>>>>
>>>> Jabbar Azam
>>>>
>>>
>>>
>>>
>>> --
>>> Thanks
>>>
>>> Jabbar Azam
>>>
>>>
>>>
>>
>>
>> --
>> Thanks
>>
>> Jabbar Azam
>>
>


-- 
Thanks

Jabbar Azam
