Subject: Re: Expanding single node to 2 node cluster
From: Aaron Morton
Date: Thu, 28 Apr 2011 09:45:14 +1200
To: user@cassandra.apache.org

You could try...

- delete / move the system data directory
- set the initial_token for each node to what they were before
- restart and recreate the schema
- run repair and then cleanup
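
As a rough sketch, the steps above might look like this on each node. The `nodetool` subcommands (`drain`, `repair`, `cleanup`) are real; everything else is an assumption for illustration: the init script name `cassandra`, the variable names, and the default data path, which is taken from the shell prompt in the quoted message. Setting the token and recreating the schema are left as manual steps.

```shell
#!/bin/sh
# Dry-run sketch of the steps above, to be run on each node in turn.
# Prints commands by default; set APPLY=1 to actually execute them.
# Assumptions (not from the thread): an init script called "cassandra",
# and CASSANDRA_DATA_DIR pointing at the Cassandra data directory.
set -eu

NODE_HOST="${NODE_HOST:-localhost}"
CASSANDRA_DATA_DIR="${CASSANDRA_DATA_DIR:-/data/cassandra/data}"
APPLY="${APPLY:-0}"

# Echo a command, and execute it only when APPLY=1.
run() {
    echo "+ $*"
    if [ "$APPLY" = "1" ]; then
        "$@"
    fi
}

# Phase 1: set the old system data aside so the node forgets its ring state.
reset_node() {
    run nodetool -h "$NODE_HOST" drain   # flush and checkpoint while the node is still up
    run service cassandra stop           # assumed service name
    run mv "$CASSANDRA_DATA_DIR/system" "$CASSANDRA_DATA_DIR/system.old"
    echo "# edit the config: set initial_token to this node's previous token"
    run service cassandra start
}

# Phase 2: only after the schema has been recreated by hand.
repair_node() {
    run nodetool -h "$NODE_HOST" repair
    run nodetool -h "$NODE_HOST" cleanup
}

case "${1:-plan}" in
    reset)  reset_node ;;
    repair) repair_node ;;
    plan)   APPLY=0; reset_node; repair_node ;;   # always print-only
    *)      echo "usage: $0 [reset|repair|plan]" >&2; exit 2 ;;
esac
```

Run with no arguments, it only prints the plan. The two phases are kept separate because repair should not start until the schema exists again, and the system directory is moved rather than deleted so it can be put back.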

It would have been a good idea to drain the nodes; this would checkpoint the logs and clear them.

If you do not know the initial tokens, I would start a new empty node as suggested and do the same.

Hope that helps.
Aaron

On 27/04/2011, at 7:07 PM, maneela a <maneelia@yahoo.com> wrote:

> Hi,
>
> I had a 2-node Cassandra cluster with replication factor 2 and OrderPreservingPartitioner, but we did not set InitialToken in the configuration files. One of the nodes was affected by the recent AWS EBS outage and was partitioned from the cluster. I continued to allow all write operations to the surviving node, because I thought AWS would resolve the EBS issues within 24 hours and the surviving node could then propagate the second replica from its hinted column family to the bad node once it recovered. Unfortunately AWS took longer than we expected, almost 4 days. So instead of recovering the 2nd node by replaying the hinted CF from node1, I did the following to get the 2nd node back into the cluster:

> 1) Shut down the cassandra service on the good node
> 2) Removed all hinted CF files
> 3) Took an EBS snapshot
> 4) Launched new EBS volumes from the above snapshot and mounted them on the 2nd node
> 5) Also copied the commit logs from node1 to node2
> In other words, I cloned node1 and mounted the clone on node2; my assumption was that the nodes of a 2-node cluster with replication factor 2 should be mirror images of each other.
>
> 6) Brought up the service on both nodes
> 7) I am not seeing both IP addresses as part of the ring when I run the nodetool command

> root@domU-12-31-39-0F-CA-61:/mnt/logs/cassandra# nodetool -h localhost ring
> Address       Status     Load          Range                                      Ring
> 10.193.201.139Up         434.77 GB     RVtMj8gWiKG0baPy                           |<--|
>
> root@ip-10-196-107-47:/data/cassandra/data/system# nodetool -h localhost ring
> Address       Status     Load          Range                                      Ring
> 10.193.201.139Up         434.77 GB     RVtMj8gWiKG0baPy                           |<--|


> I guess this is happening because both nodes have the same data, including the LocationInfo CF and the commit logs. Can someone tell me what should be done to get both IPs into the ring?
>
> Thanks
> niru