Subject: Re: Retrieving dead node's token from system keyspace
From: Jonathan Ellis
To: user
Date: Fri, 8 Oct 2010 18:04:53 -0500

You only need to removetoken if you want to re-replicate data to other
nodes. If each node has a full copy of the data, and the other nodes
have forgotten about the dead node anyway, there is no need. (If they
have not forgotten about the dead node, then the token will be in the
ring information.)

On Fri, Oct 8, 2010 at 2:39 PM, Allan Carroll wrote:
>
> I had a cluster of three nodes with RF=3 that I was using. Then, my demand
> dropped off quite a bit and I was trying to bring the cluster down to just
> one node for some time while working on other things to lower my server
> costs.
>
> Dropping the first node off the cluster worked fine using nodetool
> decommission. On the second node, I forgot to decommission the node before
> terminating the server instance. For some reason, this caused the remaining
> node to stop working. So, now I have one broken node and a backup of the
> data from the second node.
>
> I'd like to just bring up the one node and get it working again. It should
> have a copy of all the data since I never ran the cluster with more nodes
> than the RF.
>
> Here's some more info on where I'm at that might help.
>
> All the servers were running 0.6.5.
>
> This is the output I get from nodetool ring:
>
> Address       Status     Load          Range                                      Ring
> 10.202.65.143 Up         27.13 GB      165675654950889355108929973590945588660    |<--|
>
> I dumped the LocationInfo table and ran nodetool removetoken on anything
> that looked remotely like a token. Every time, nodetool produced no output.
> Except when I tried to remove the token given in the ring output. It, of
> course, told me I couldn't remove the token from the local node.
>
> I tried rebuilding the node from scratch yesterday but got only the same
> results. The token shown in the ring was different, but otherwise, all
> output there is the same.
>
> The more extreme option I considered today is creating a whole new node on a
> new server, running all the db files out to json and then importing them
> into the new node. Not sure that'll be any different than what I've tried,
> but it feels like it would be as clean as I could get.
>
> Thanks for the followups,
> Allan
>
> On Oct 7, 2010, at 7:00 PM, Matthew Dennis wrote:
>
> Allan,
>
> I'm confused on why removetoken doesn't do anything and would be interested
> in finding out why, but to answer your question:
>
> You can shut down your last node, nuke the system directory (make a
> backup just in case), restart the node, load the schema (export it first if
> need be) and be on your way. You should end up with a node that is the
> only one in the ring. Again, make a backup of the system directory
> (actually, might as well just back up the entire data and commitlog
> directories) before you start nuking stuff.
>
> On Thu, Oct 7, 2010 at 7:12 PM, Aaron Morton wrote:
>>
>> Allan,
>> I'm a bit confused about what you are trying to do here. You have 2 nodes
>> with RF = ? , you lost one node completely and now you want to...
>> Just get a cluster running again, don't worry about the data.
>> OR
>> Restore the data from the dead node.
>> OR
>> Create a cluster with the data from the remaining node and a new node.
>> Aaron
>>
>> On 08 Oct, 2010, at 11:15 AM, Allan Carroll wrote:
>>
>> I was able to figure out how to use the sstable2json tool to get the values
>> out of the system keyspace.
>>
>> Unfortunately, the node that went down took all of its data with it and I
>> only have access to the system keyspace of the remaining live node. There
>> were only two nodes and the one left should have a whole DB copy.
>>
>> Running removetoken on any of the values that appeared to be tokens in the
>> LocationInfo cf hasn't done any good. Perhaps I'm missing which value is the
>> token of the dead node? Or, is there a way to take down the last node and
>> bring back up a new cluster using the sstables that I have on the remaining
>> node?
>>
>> -Allan
>>
>> On Oct 7, 2010, at 3:22 PM, Allan Carroll wrote:
>>
>> > Hey all,
>> >
>> > I had a node go down that I'm not able to get a token for from nodetool
>> > ring.
>> >
>> > The wiki says:
>> >
>> > "You can obtain the dead node's token by running nodetool ring on any
>> > live node, unless there was some kind of outage, and the others came up but
>> > not the down one -- in that case, you can retrieve the token from the live
>> > nodes' system tables."
>> >
>> > But, I can't for the life of me figure out how to get the system
>> > keyspace to give up the secret. All attempts end up in:
>> >
>> > ERROR [pool-1-thread-2] 2010-10-07 21:20:44,865 Cassandra.java (line
>> > 1280) Internal error processing get_slice
>> > java.lang.RuntimeException: No replica strategy configured for system
>> >
>> >
>> > Can someone point me at a good way to get the token?
>> >
>> > Thanks
>> > -Allan
>> >
>
>
>
> --
> Riptano
> Software and Support for Apache Cassandra
> http://www.riptano.com/
> mdennis@riptano.com
> m: 512.587.0900 f: 866.583.2068
>
>

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
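
The single-node recovery that Matthew Dennis describes in the quoted
thread (stop the node, back up, remove the system directory, restart,
reload the schema) might be sketched as below for the backup-and-remove
step only. This is a minimal sketch, not a procedure tested by anyone in
the thread. The function name backup_and_reset_system is hypothetical,
the code assumes the system keyspace lives in a "system" subdirectory of
the data directory, and it assumes the Cassandra process has already been
stopped. Restarting the node and reloading the schema are left out.

```python
import shutil
import time
from pathlib import Path


def backup_and_reset_system(data_dir, commitlog_dir, backup_root):
    """Back up the data and commitlog directories, then remove the
    system keyspace directory so the node starts with fresh ring state.

    Hypothetical helper; run it only while Cassandra is stopped.
    Returns the path of the backup that was created.
    """
    data_dir = Path(data_dir)
    commitlog_dir = Path(commitlog_dir)

    # Timestamped destination; mkdir raises if it already exists, so an
    # earlier backup is never silently overwritten.
    dest = Path(backup_root) / time.strftime("cassandra-backup-%Y%m%d-%H%M%S")
    dest.mkdir(parents=True)

    # Copy everything first. Nothing is deleted unless both copies succeed.
    shutil.copytree(data_dir, dest / "data")
    shutil.copytree(commitlog_dir, dest / "commitlog")

    # Assumption: the system keyspace is stored under <data_dir>/system.
    system_dir = data_dir / "system"
    if system_dir.is_dir():
        shutil.rmtree(system_dir)

    return dest
```

For example, backup_and_reset_system("/var/lib/cassandra/data",
"/var/lib/cassandra/commitlog", "/mnt/backups") would apply it to a node
that uses those directories. All three paths are placeholders and should
be replaced with the directories configured for the node in question.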