From: aaron morton
To: user@cassandra.apache.org
Subject: Re: New node not joining
Date: Fri, 6 May 2011 11:51:19 +1200

When adding nodes it is a *very* good idea to manually set the tokens, see http://wiki.apache.org/cassandra/Operations#Load_balancing

Bootstrap is a process that happens only once on a node: as well as telling the other nodes it's around, it asks them to stream over the data it will now be responsible for.

nodetool loadbalance is an old utility that should have better warnings not to use it. The best way to load balance the cluster is to create the tokens manually and assign them, either with the initial_token config param or with nodetool move.

Hope that helps.

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 6 May 2011, at 08:37, Sanjeev Kulkarni wrote:

> Here is what I did.
> I booted up the first one.
> After that I started the second one with bootstrap turned off.
> Then I did a nodetool loadbalance on the second node.
> After which I added the third node again with bootstrap turned off. Then did the loadbalance again on the third node.
> This seems to have successfully completed and I am now able to read/write into my system.
> Thanks!
>
> On Thu, May 5, 2011 at 1:22 PM, Len Bucchino <Len.Bucchino@veritix.com> wrote:
> I just rebuilt the cluster in the same manner as I did originally, except after I set up the first node I added a keyspace and column family before adding any new nodes. This time the 3rd node auto bootstrapped successfully.
>
> From: Len Bucchino [mailto:Len.Bucchino@veritix.com]
> Sent: Thursday, May 05, 2011 1:31 PM
> To: user@cassandra.apache.org
> Subject: RE: New node not joining
>
> Also, setting auto_bootstrap to false and setting token to the one that it said it would use in the logs allows the new node to join the ring.
>
> From: Len Bucchino [mailto:Len.Bucchino@veritix.com]
> Sent: Thursday, May 05, 2011 1:25 PM
> To: user@cassandra.apache.org
> Subject: RE: New node not joining
>
> Adding the fourth node to the cluster with an empty schema using auto_bootstrap was not successful. A nodetool netstats on the new node shows “Mode: Joining: getting bootstrap token” similar to what the third node did before it was manually added. Also, there are no exceptions in the logs but it never joins the ring.
>
> From: Sanjeev Kulkarni [mailto:sanjeev@locomatix.com]
> Sent: Thursday, May 05, 2011 11:47 AM
> To: user@cassandra.apache.org
> Subject: Re: New node not joining
>
> Hi Len,
>
> This looks like a decent workaround. I would be very interested to see how the addition of the 4th node went. Please post it whenever you get a chance.
>
> Thanks!
>
> On Thu, May 5, 2011 at 6:47 AM, Len Bucchino <Len.Bucchino@veritix.com> wrote:
>
> I have the same problem on 0.7.5 auto bootstrapping a 3rd node onto an empty 2 node test cluster (the two nodes were manually added), and it currently has an empty schema. My log entries look similar to yours. I took the new token it says it's going to use from the log file, added it to the yaml, and turned off auto bootstrap, and the node added fine. I'm bringing up a 4th node now and will see if it has the same problem auto bootstrapping.
>
> From: Sanjeev Kulkarni [sanjeev@locomatix.com]
> Sent: Thursday, May 05, 2011 2:18 AM
> To: user@cassandra.apache.org
> Subject: New node not joining
>
> Hey guys,
>
> I'm running into what seems like a very basic problem.
>
> I have a one node cassandra instance. Version 0.7.5. Freshly installed. Contains no data.
>
> The cassandra.yaml is the same as the default one that is supplied, except for data/commitlog/saved_caches directories.
>
> I also changed the addresses to point to an externally visible IP address.
>
> The cassandra comes up nicely and is ready to accept thrift connections.
>
> I do a nodetool and this is what I get.
>
> 10.242.217.124 Up Normal 6.54 KB 100.00% 110022862993086789903543147927259579701
>
> Which seems right to me.
>
> Now I start another node. Almost identical configuration to the first one. Except the bootstrap is turned true and seeds appropriately set.
>
> When I start the second, I notice that the second one contacts the first node to get the new token.
>
> I see the following lines in the first machine (the seed machine).
>
> INFO [GossipStage:1] 2011-05-05 07:00:20,427 Gossiper.java (line 628) Node /10.83.111.80 has restarted, now UP again
> INFO [HintedHandoff:1] 2011-05-05 07:00:55,162 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.83.111.80
> INFO [HintedHandoff:1] 2011-05-05 07:00:55,164 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.83.111.80
>
> However when I do a node ring, I still get
>
> 10.242.217.124 Up Normal 6.54 KB 100.00% 110022862993086789903543147927259579701
>
> Even though the second node has come up. On the second machine the logs say
>
> INFO [main] 2011-05-05 07:00:19,124 StorageService.java (line 504) Joining: getting load information
> INFO [main] 2011-05-05 07:00:19,124 StorageLoadBalancer.java (line 351) Sleeping 90000 ms to wait for load information...
> INFO [GossipStage:1] 2011-05-05 07:00:20,828 Gossiper.java (line 628) Node /10.242.217.124 has restarted, now UP again
> INFO [HintedHandoff:1] 2011-05-05 07:00:29,548 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.242.217.124
> INFO [HintedHandoff:1] 2011-05-05 07:00:29,550 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.242.217.124
> INFO [main] 2011-05-05 07:01:49,137 StorageService.java (line 504) Joining: getting bootstrap token
> INFO [main] 2011-05-05 07:01:49,148 BootStrapper.java (line 148) New token will be 24952271262852174037699496069317526837 to assume load from /10.242.217.124
> INFO [main] 2011-05-05 07:01:49,150 Mx4jTool.java (line 72) Will not load MX4J, mx4j-tools.jar is not in the classpath
> INFO [main] 2011-05-05 07:01:49,259 CassandraDaemon.java (line 112) Binding thrift service to /10.83.111.80:9160
> INFO [main] 2011-05-05 07:01:49,262 CassandraDaemon.java (line 126) Using TFastFramedTransport with a max frame size of 15728640 bytes.
> INFO [Thread-5] 2011-05-05 07:01:49,266 CassandraDaemon.java (line 154) Listening for thrift clients...
>
> This seems to indicate that the second node has joined the ring. And has gotten its key range.
>
> Am I missing anything?
>
> Thanks!

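A footnote to the advice at the top about creating the tokens manually. This is only a minimal sketch, and it assumes the cluster uses RandomPartitioner (the default in the 0.7 cassandra.yaml), whose tokens fall in the range 0 to 2**127. The function name balanced_tokens is made up for illustration; all it does is space N tokens evenly around the ring.

```python
# Sketch only: assumes RandomPartitioner, whose token space is 0 .. 2**127.
# balanced_tokens is a hypothetical helper name, not part of Cassandra.
RING_SIZE = 2 ** 127


def balanced_tokens(node_count):
    """Return node_count evenly spaced tokens, one per node."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # Integer arithmetic keeps the large values exact.
    return [i * RING_SIZE // node_count for i in range(node_count)]


if __name__ == "__main__":
    # Print one token per node for a three node cluster.
    for index, token in enumerate(balanced_tokens(3)):
        print("node %d: %d" % (index, token))
```

Each value would go into one node's initial_token setting before its first start, or be given to nodetool move on a node that is already in the ring. With a different partitioner the tokens are not plain integers in this range, so the sketch would not apply.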