incubator-cassandra-user mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: One of my nodes is in the wrong datacenter - help!
Date Mon, 10 Feb 2014 18:51:42 GMT
Maybe that node was just trying to tell you that it really wanted to work
in a different data center :)


On Mon, Feb 10, 2014 at 10:08 AM, Sholes, Joshua <
Joshua_Sholes@cable.comcast.com> wrote:

>  In case anyone was following this issue, it ended up being something
> that looked an awful lot like CASSANDRA-6053: when the node was removed,
> it was not successfully removed from the peers table on all nodes, and thus
> several of them were doing their best to contact it despite it being
> down.
>  --
> Josh Sholes
>
>   From: <Sholes>, Josh Sholes <Joshua_Sholes@cable.comcast.com>
> Date: Thursday, February 6, 2014 at 1:41 PM
> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Subject: Re: One of my nodes is in the wrong datacenter - help!
>
>    Thanks for the advice.   I did use "removenode" as I was aware of the
> replace_token problems.
> I haven't run into the issue in CASSANDRA-6615 yet, and I don't believe
> I'm at risk for it.
>
>  I'm actually running into a different problem. Having run removenode
> on the node with the incorrect datacenter name, I am still getting
> "one or more nodes were unavailable" messages when doing queries with
> consistency=all. I'm doing a full repair pass on the column family in
> question just to be safe (which is taking forever!) before I do anything
> else. So to reiterate: my cluster now shows 7 nodes up when looking with
> gossipinfo or status, but will still not do consistency=all queries. Are
> there any best practices for finding other issues with the cluster, or
> should I anticipate the repair pass will fix the problem?
>  --
> Josh Sholes
>
>   From: Robert Coli <rcoli@eventbrite.com>
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Date: Monday, February 3, 2014 at 7:30 PM
> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Subject: Re: One of my nodes is in the wrong datacenter - help!
>
>    On Sun, Feb 2, 2014 at 10:48 AM, Sholes, Joshua <
> Joshua_Sholes@cable.comcast.com> wrote:
>
>>  I had a node in my 8-node production 1.2.8 cluster have a serious
>> problem and need to be removed and rebuilt.   However, after doing nodetool
>> removenode and then bootstrapping a new node on the same IP address, the
>> new node somehow ended up with a different datacenter name (the rest of the
>> nodes are in dc $NAME, and the new one is in dc $NAME6934724 -- as in, a
>> string of seemingly random numbers appended to the correct name).   How can
>> I force it to change DC names back to what it should be?
>>
>
>  You could change the entry in the system.local columnfamily on the
> affected node...
>
>  cqlsh> UPDATE system.local SET data_center = '$NAME' WHERE key = 'local';
>
> ... but that is Not Supported and may have side effects of which I am not
> aware.
>
>>  I'm working with 500+GB per node here so bootstrapping it again is not
>> a huge issue, but I'd prefer to avoid it anyway.  I am NOT able to change
>> the node's IP address at this time so I'm stuck with bootstrapping a new
>> node in the same place, which my gut feeling tells me might be part of the
>> problem.
>>
>
>  Note that replace_node/replace_token are broken in 1.2.8. Did you
> attempt to use either of these? I presume not, because you said you did
> removenode...
>
>   If I were you, I would probably removenode and re-bootstrap, as the
> safest alternative.
>
>  As an aside, while trying to deal with this issue you should be aware of
> this ticket, so you do not do the sequence of actions it describes.
>
>  https://issues.apache.org/jira/browse/CASSANDRA-6615
>
>  =Rob
>
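The 10 Feb follow-up in this thread traces the problem to a removed node that was still listed in the peers table on some nodes. One way to look for that condition is to run `SELECT peer FROM system.peers;` on every node and compare the results against the addresses that status/gossipinfo report as current members. Below is a minimal sketch of that comparison in Python. The function name `find_stale_peers`, the input shape, and the 10.0.0.x addresses are all hypothetical, and collecting the rows from each node is left out; this is not something the posters ran.

```python
from typing import Dict, Iterable, Set


def find_stale_peers(
    peers_by_node: Dict[str, Iterable[str]],
    live_nodes: Set[str],
) -> Dict[str, Set[str]]:
    """Return, for each node, the peer addresses it still lists that are
    not current cluster members.

    peers_by_node maps a node address to the 'peer' values read from that
    node's system.peers table (hypothetical input; gathering it is up to
    the operator). live_nodes is the set of addresses that nodetool status
    reports as members.
    """
    stale: Dict[str, Set[str]] = {}
    for node, peers in peers_by_node.items():
        leftover = set(peers) - live_nodes
        if leftover:
            # Only report nodes that actually hold a leftover entry.
            stale[node] = leftover
    return stale


# Illustrative data only: 10.0.0.8 stands in for the removed node.
live = {"10.0.0.1", "10.0.0.2", "10.0.0.3"}
observed = {
    "10.0.0.1": ["10.0.0.2", "10.0.0.3"],
    "10.0.0.2": ["10.0.0.1", "10.0.0.3", "10.0.0.8"],
    "10.0.0.3": ["10.0.0.1", "10.0.0.2"],
}
stale_report = find_stale_peers(observed, live)
```

With the sample data, only 10.0.0.2 is reported, because it is the only node still listing the removed address. How to clean up such an entry is a separate question that this sketch does not address.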
