incubator-cassandra-user mailing list archives

From "Sholes, Joshua" <Joshua_Sho...@cable.comcast.com>
Subject Re: One of my nodes is in the wrong datacenter - help!
Date Mon, 10 Feb 2014 15:08:53 GMT
In case anyone was following this issue, it ended up being something that looked an awful lot
like CASSANDRA-6053: when the node was removed, its entry was not successfully removed from the
peers table on every node, and so several of them kept trying to contact it even though it
was down.
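
For anyone wanting to check for the same symptom, here is a minimal sketch of the comparison involved. It assumes you have already collected the peer column from each node (for example with SELECT peer FROM system.peers in cqlsh on every node) and know which addresses should be in the cluster. The function name and the addresses are hypothetical, and nothing here talks to Cassandra itself.

```python
from typing import Dict, Iterable, List, Set


def find_stale_peers(
    peers_by_node: Dict[str, Iterable[str]],
    live_members: Set[str],
) -> Dict[str, List[str]]:
    """Return, per node, the peer addresses that are not expected cluster members."""
    stale: Dict[str, List[str]] = {}
    for node, peers in peers_by_node.items():
        # Anything a node lists as a peer that is not a known member is suspect.
        leftovers = sorted(set(peers) - live_members)
        if leftovers:
            stale[node] = leftovers
    return stale


# Hypothetical addresses; 10.0.0.8 stands in for the removed node.
live = {"10.0.0.1", "10.0.0.2", "10.0.0.3"}
observed = {
    "10.0.0.1": ["10.0.0.2", "10.0.0.3"],
    "10.0.0.2": ["10.0.0.1", "10.0.0.3", "10.0.0.8"],
    "10.0.0.3": ["10.0.0.1", "10.0.0.2"],
}
print(find_stale_peers(observed, live))
```

Any node that shows up in the result still has a leftover peers entry for an address that should be gone.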
--
Josh Sholes

From: Josh Sholes <Joshua_Sholes@cable.comcast.com>
Date: Thursday, February 6, 2014 at 1:41 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: One of my nodes is in the wrong datacenter - help!

Thanks for the advice. I did use “removenode” because I was aware of the replace_token problems.
I haven’t run into the issue in CASSANDRA-6615 yet, and I don’t believe I’m at risk
for it.

I’m actually running into a different problem. Having run removenode on the node with
the incorrect datacenter name, I am still getting “one or more nodes were unavailable”
messages when running queries with consistency=all. I’m doing a full repair pass on the
column family in question just to be safe (which is taking forever!) before I do anything
else. So to reiterate: my cluster now shows 7 nodes up in both gossipinfo and
status, but it still will not serve consistency=all queries. Are there any best practices for
finding other issues with the cluster, or should I expect the repair pass to fix
the problem?
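
As a rough illustration of why consistency=all can fail even when status shows every remaining node up, here is a simplified sketch. It assumes a single datacenter and that a coordinator counts a replica as unavailable whenever its own view marks that address down. The helper names and addresses are hypothetical, and this is not Cassandra's actual code path.

```python
from typing import Iterable, Set


def required_replicas(consistency: str, replication_factor: int) -> int:
    """Number of replica responses needed for a few common consistency levels."""
    levels = {
        "ONE": 1,
        "QUORUM": replication_factor // 2 + 1,
        "ALL": replication_factor,
    }
    return levels[consistency.upper()]


def can_satisfy(consistency: str, replicas: Iterable[str], live_nodes: Set[str]) -> bool:
    """True if enough of the replicas (as this coordinator sees them) are live."""
    replica_list = list(replicas)
    live_count = sum(1 for r in replica_list if r in live_nodes)
    return live_count >= required_replicas(consistency, len(replica_list))


live_nodes = {"10.0.0.1", "10.0.0.2", "10.0.0.3"}

# Coordinator with a clean view: all three replicas are live nodes.
clean_view = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
# Coordinator that still believes the removed node (10.0.0.8) is a replica.
stale_view = ["10.0.0.1", "10.0.0.2", "10.0.0.8"]

print(can_satisfy("ALL", clean_view, live_nodes))
print(can_satisfy("ALL", stale_view, live_nodes))
print(can_satisfy("QUORUM", stale_view, live_nodes))
```

Under these assumptions, one phantom replica is enough to fail ALL while QUORUM still succeeds.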
--
Josh Sholes

From: Robert Coli <rcoli@eventbrite.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Monday, February 3, 2014 at 7:30 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: One of my nodes is in the wrong datacenter - help!

On Sun, Feb 2, 2014 at 10:48 AM, Sholes, Joshua <Joshua_Sholes@cable.comcast.com> wrote:
A node in my 8-node production 1.2.8 cluster had a serious problem and needed to be removed
and rebuilt. However, after doing nodetool removenode and then bootstrapping a new node
on the same IP address, the new node somehow ended up with a different datacenter name (the
rest of the nodes are in dc $NAME, and the new one is in dc $NAME6934724, that is, with a string
of seemingly random numbers appended to the correct name). How can I force it to change
its DC name back to what it should be?

You could change the entry in the system.local columnfamily on the affected node...

cqlsh> UPDATE system.local SET data_center = '$NAME' WHERE key = 'local';

... but that is Not Supported and may have side effects of which I am not aware.

I’m working with 500+GB per node here so bootstrapping it again is not a huge issue, but
I’d prefer to avoid it anyway.  I am NOT able to change the node’s IP address at this
time so I’m stuck with bootstrapping a new node in the same place, which my gut feeling
tells me might be part of the problem.

Note that replace_node/replace_token are broken in 1.2.8, did you attempt to use either of
these? I presume not because you said you did removenode...

If I were you, I would probably removenode and re-bootstrap, as the safest alternative.

As an aside, while trying to deal with this issue you should be aware of this ticket, so you
do not do the sequence of actions it describes.

https://issues.apache.org/jira/browse/CASSANDRA-6615

=Rob
