incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: New node not joining
Date Sun, 08 May 2011 22:40:13 GMT
Ah, I see the case you are talking about. 

The node will auto bootstrap on startup if, when it joins the ring, all of the following hold:
it is not already bootstrapped, auto bootstrap is enabled, and the node is not in its own seed list.

The auto bootstrap process then finds the token it wants, but aborts if there
are no non-system tables defined. That may be because the bootstrap code finds the node
with the highest load and splits its range; if all the nodes have zero load (no user data)
then that process is unreliable. But it's also unreliable if there is a schema and no data.
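
The conditions described above can be sketched roughly as follows. This is only an illustration of the behaviour reported in this thread, not Cassandra's actual code; `NodeState`, `will_attempt_bootstrap` and `bootstrap_outcome` are hypothetical names, and the outcome strings are descriptive labels rather than real log messages.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    """Hypothetical snapshot of the facts this thread says are checked."""
    already_bootstrapped: bool
    auto_bootstrap: bool       # auto bootstrap enabled in the node's yaml
    in_own_seed_list: bool     # the node lists itself as a seed
    has_user_keyspaces: bool   # at least one non-system table is defined


def will_attempt_bootstrap(node: NodeState) -> bool:
    # All three conditions from the message must hold.
    return (not node.already_bootstrapped
            and node.auto_bootstrap
            and not node.in_own_seed_list)


def bootstrap_outcome(node: NodeState) -> str:
    if not will_attempt_bootstrap(node):
        return "joins without bootstrapping"
    if not node.has_user_keyspaces:
        # The case reported in this thread: a token is chosen,
        # but the node never joins the ring.
        return "bootstrap aborted"
    return "bootstraps and streams data"


if __name__ == "__main__":
    new_node = NodeState(already_bootstrapped=False, auto_bootstrap=True,
                         in_own_seed_list=False, has_user_keyspaces=False)
    print(bootstrap_outcome(new_node))
```

Under these assumptions, a new node on a cluster with an empty schema lands in the "bootstrap aborted" branch, while the same node with itself in its seed list skips bootstrap and joins.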


Created https://issues.apache.org/jira/browse/CASSANDRA-2625 to see if it can be changed.


Thanks

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 7 May 2011, at 05:25, Len Bucchino wrote:

> While I agree that what you suggested is a very good idea, the bootstrapping process _should_
work properly.
>  
> Here is some additional detail on the original problem.  If the current node that you
are trying to bootstrap has itself listed in seeds in its yaml then it will be able to bootstrap
on an empty schema.  If it does not have itself listed in seeds in its yaml and you have an
empty schema then the bootstrap process will not complete and no errors will be reported in
the logs even with debug enabled.
>  
> From: aaron morton [mailto:aaron@thelastpickle.com] 
> Sent: Thursday, May 05, 2011 6:51 PM
> To: user@cassandra.apache.org
> Subject: Re: New node not joining
>  
> When adding nodes it is a *very* good idea to manually set the tokens, see http://wiki.apache.org/cassandra/Operations#Load_balancing
>  
> Bootstrap is a process that happens only once on a node: as well as telling the
other nodes it's around, it asks them to stream over the data it will now be responsible for.

>  
> nodetool loadbalance is an old utility that should have better warnings not to use it.
The best way to load balance the cluster is manually creating the tokens and assigning them
either using the initial_token config param or using nodetool move. 
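
As a rough illustration of creating tokens manually, the sketch below spaces tokens evenly around the ring. It assumes the cluster uses RandomPartitioner, whose token space runs from 0 to 2**127; the function name `balanced_tokens` is made up for this example, and other partitioners would need a different calculation.

```python
from typing import List

# Assumption: RandomPartitioner, whose tokens fall in the range [0, 2**127).
RING_SIZE = 2 ** 127


def balanced_tokens(node_count: int) -> List[int]:
    """Return one evenly spaced token per node."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [i * RING_SIZE // node_count for i in range(node_count)]


if __name__ == "__main__":
    for index, token in enumerate(balanced_tokens(3)):
        print(f"node {index}: {token}")
```

Each value would then be given to one node, either through initial_token before its first start or with nodetool move on a node that is already in the ring.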
>  
> Hope that helps. 
>  
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>  
> On 6 May 2011, at 08:37, Sanjeev Kulkarni wrote:
> 
> 
> Here is what I did.
> I booted up the first one. After that I started the second one with bootstrap turned
off.
> Then I did a nodetool loadbalance on the second node. 
> After which I added the third node again with bootstrap turned off. Then did the loadbalance
again on the third node.
> This seems to have successfully completed and I am now able to read/write into my system.
> Thanks!
> 
> On Thu, May 5, 2011 at 1:22 PM, Len Bucchino <Len.Bucchino@veritix.com> wrote:
> I just rebuilt the cluster in the same manner as I did originally, except after I set up
the first node I added a keyspace and column family before adding any new nodes.  This time
the 3rd node auto bootstrapped successfully.
>  
> From: Len Bucchino [mailto:Len.Bucchino@veritix.com] 
> Sent: Thursday, May 05, 2011 1:31 PM
> 
> To: user@cassandra.apache.org
> Subject: RE: New node not joining
>  
>  
> Also, setting auto_bootstrap to false and setting token to the one that it said it would
use in the logs allows the new node to join the ring.
>  
> From: Len Bucchino [mailto:Len.Bucchino@veritix.com] 
> Sent: Thursday, May 05, 2011 1:25 PM
> To: user@cassandra.apache.org
> Subject: RE: New node not joining
>  
> Adding the fourth node to the cluster with an empty schema using auto_bootstrap was not
successful.  A nodetool netstats on the new node shows “Mode: Joining: getting bootstrap
token” similar to what the third node did before it was manually added.  Also, there are
no exceptions in the logs but it never joins the ring.
>  
> From: Sanjeev Kulkarni [mailto:sanjeev@locomatix.com] 
> Sent: Thursday, May 05, 2011 11:47 AM
> To: user@cassandra.apache.org
> Subject: Re: New node not joining
>  
> Hi Len,
> This looks like a decent workaround. I would be very interested to see how the addition
of the 4th node went. Please post it whenever you get a chance.
> Thanks!
>  
> On Thu, May 5, 2011 at 6:47 AM, Len Bucchino <Len.Bucchino@veritix.com> wrote:
> I have the same problem on 0.7.5 auto bootstrapping a 3rd node onto an empty 2 node test
cluster (the two nodes were manually added), and it currently has an empty schema.  My
log entries look similar to yours.  I took the new token it says it's going to use from the
log file, added it to the yaml, and turned off auto bootstrap, and the node added fine.  I'm
bringing up a 4th node now and will see if it has the same problem auto bootstrapping.
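
Pulling the chosen token out of the log can be scripted. The sketch below assumes the log contains a line of the form shown later in this thread ("New token will be ... to assume load from ..."); the function name `find_bootstrap_token` is hypothetical, and the wording of the message may differ between Cassandra versions.

```python
import re
from typing import Optional

# Matches the BootStrapper message quoted in this thread.
TOKEN_LINE = re.compile(r"New token will be (\d+)")


def find_bootstrap_token(log_text: str) -> Optional[str]:
    """Return the token announced in the log, or None if no such line exists."""
    match = TOKEN_LINE.search(log_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    sample = (
        "INFO [main] 2011-05-05 07:01:49,148 BootStrapper.java (line 148) "
        "New token will be 24952271262852174037699496069317526837 "
        "to assume load from /10.242.217.124"
    )
    print(find_bootstrap_token(sample))
```

The extracted value is what would go into the yaml as initial_token, with auto_bootstrap set to false, as described in the workaround above.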
>  
> From: Sanjeev Kulkarni [sanjeev@locomatix.com]
> Sent: Thursday, May 05, 2011 2:18 AM
> To: user@cassandra.apache.org
> Subject: New node not joining
> 
> Hey guys,
> I'm running into what seems like a very basic problem.
> I have a one node cassandra instance. Version 0.7.5. Freshly installed. Contains no data.
> The cassandra.yaml is the same as the default one that is supplied, except for data/commitlog/saved_caches
directories.
> I also changed the addresses to point to an externally visible ip address.
> The cassandra comes up nicely and is ready to accept thrift connections.
> I do a nodetool and this is what I get.
>  
> 10.242.217.124  Up     Normal  6.54 KB         100.00% 110022862993086789903543147927259579701
>  
> Which seems right to me.
>  
> Now I start another node, with an almost identical configuration to the first one, except that
bootstrap is set to true and the seeds are set appropriately.
> When I start the second, I notice that the second one contacts the first node to get
the new token.
> I see the following lines on the first machine (the seed machine).
>  
> INFO [GossipStage:1] 2011-05-05 07:00:20,427 Gossiper.java (line 628) Node /10.83.111.80 has restarted, now UP again
>  INFO [HintedHandoff:1] 2011-05-05 07:00:55,162 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.83.111.80
>  INFO [HintedHandoff:1] 2011-05-05 07:00:55,164 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.83.111.80
>  
> However, when I do a nodetool ring, I still get
>  
> 10.242.217.124  Up     Normal  6.54 KB         100.00% 110022862993086789903543147927259579701
>  
> Even though the second node has come up. On the second machine the logs say
>  
> INFO [main] 2011-05-05 07:00:19,124 StorageService.java (line 504) Joining: getting load information
>  INFO [main] 2011-05-05 07:00:19,124 StorageLoadBalancer.java (line 351) Sleeping 90000 ms to wait for load information...
>  INFO [GossipStage:1] 2011-05-05 07:00:20,828 Gossiper.java (line 628) Node /10.242.217.124 has restarted, now UP again
>  INFO [HintedHandoff:1] 2011-05-05 07:00:29,548 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.242.217.124
>  INFO [HintedHandoff:1] 2011-05-05 07:00:29,550 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.242.217.124
>  INFO [main] 2011-05-05 07:01:49,137 StorageService.java (line 504) Joining: getting bootstrap token
>  INFO [main] 2011-05-05 07:01:49,148 BootStrapper.java (line 148) New token will be 24952271262852174037699496069317526837 to assume load from /10.242.217.124
>  INFO [main] 2011-05-05 07:01:49,150 Mx4jTool.java (line 72) Will not load MX4J, mx4j-tools.jar is not in the classpath
>  INFO [main] 2011-05-05 07:01:49,259 CassandraDaemon.java (line 112) Binding thrift service to /10.83.111.80:9160
>  INFO [main] 2011-05-05 07:01:49,262 CassandraDaemon.java (line 126) Using TFastFramedTransport with a max frame size of 15728640 bytes.
>  INFO [Thread-5] 2011-05-05 07:01:49,266 CassandraDaemon.java (line 154) Listening for thrift clients...
>  
> This seems to indicate that the second node has joined the ring. And has gotten its key
range. 
> Am I missing anything?
> 
> Thanks!
>  
>  
>  
>  

