incubator-cassandra-user mailing list archives

From Alexis Lê-Quôc <...@datadoghq.com>
Subject Ghost node showing up in the ring
Date Tue, 22 Mar 2011 22:23:39 GMT
Hi,

I've seen a decommissioned node suddenly reappear in the ring, which
leads to my question: where is the ring structure maintained (in
memory, with a local copy on each node?) and what prompts it to
change? I'd appreciate any thoughts on the events below.

I'm running 0.7.4 on 4 EC2 large machines with a replication factor of
3. On Sunday I dropped a node that was misbehaving (drained then
decommissioned). Everything was well until a few minutes ago:

On 1.2.3.47 (never mind the temporary key imbalance):
ubuntu@YYY:~$ nodetool -h localhost ring
1.2.3.47   Up     Normal  17.89 GB        12.48%  0
1.2.3.36   Up     Normal  27.72 GB        25.00%  42535295865117307932921825928971026432
1.2.3.193  Up     Normal  42.14 GB        50.00%  127605887595351923798765477786913079296
1.2.3.252  Up     Normal  36.71 GB        12.52%  148904621249875869977532879268261763219

Then, all of a sudden, the node that used to sit in the middle of the
ring shows up again (as "Down").
The machine itself was decommissioned over the weekend, and I have
confirmed that it is no longer in play.

ubuntu@YYY:~$ nodetool -h localhost ring
1.2.3.47   Up     Normal  17.93 GB        12.48%  0
1.2.3.36   Up     Normal  27.76 GB        25.00%  42535295865117307932921825928971026432
2.3.4.193  Down   Normal  12.35 GB        25.00%  85070591730234615865843651857942052864
1.2.3.193  Up     Normal  42.24 GB        25.00%  127605887595351923798765477786913079296
1.2.3.252  Up     Normal  36.66 GB        12.52%  148904621249875869977532879268261763219

From logs on each node:
2011-03-22T21:30:17.040407+00:00 Node /2.3.4.193 is now part of the cluster
2011-03-22T21:30:16.956335+00:00 Node /2.3.4.193 is now part of the cluster
2011-03-22T21:30:18.887269+00:00 Node /2.3.4.193 is now part of the cluster
2011-03-22T21:30:18.978861+00:00 Node /2.3.4.193 is now part of the cluster

(a node coming back from the dead)
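
For reference, the change in the "Owns" column between the two ring
listings above can be reproduced from the tokens alone. This is only a
sketch: it assumes RandomPartitioner (token space of 2**127) and that
each node owns the range from the previous token up to its own. The
function and variable names are mine, not anything from Cassandra.

```python
# Assumption: RandomPartitioner, whose token space is 0 .. 2**127.
RING_SIZE = 2 ** 127


def ownership(tokens):
    """Return {token: percent of the ring owned}.

    Each node is assumed to own the range (previous token, own token],
    with the lowest token wrapping around to the highest one.
    """
    ordered = sorted(tokens)
    owned = {}
    for i, token in enumerate(ordered):
        previous = ordered[i - 1]  # i == 0 wraps to the last token
        owned[token] = ((token - previous) % RING_SIZE) / RING_SIZE * 100
    return owned


# Tokens copied from the ring listings in this message.
four_nodes = [
    0,
    42535295865117307932921825928971026432,
    127605887595351923798765477786913079296,
    148904621249875869977532879268261763219,
]
ghost_token = 85070591730234615865843651857942052864
with_ghost = four_nodes + [ghost_token]

for label, tokens in (("without ghost", four_nodes), ("with ghost", with_ghost)):
    print(label)
    for token, percent in ownership(tokens).items():
        print(f"  {percent:6.2f}%  {token}")
```

Under those assumptions the ghost token sits exactly halfway through
the range that 1.2.3.193 owned, which is why its 50.00% is split into
25.00% and 25.00% as soon as the ghost is back in the ring.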

On 1.2.3.193, trying to remove the ghost token...
ubuntu@XXX:~$ nodetool -h localhost ring

148904621249875869977532879268261763219
1.2.3.47   Up     Normal  17.93 GB        12.48%  0
1.2.3.36   Up     Normal  27.76 GB        25.00%  42535295865117307932921825928971026432
2.3.4.193  Down   Leaving 12.35 GB        25.00%  85070591730234615865843651857942052864
1.2.3.193  Up     Normal  52.06 GB        25.00%  127605887595351923798765477786913079296
1.2.3.252  Up     Normal  43.11 GB        12.52%  148904621249875869977532879268261763219

ubuntu@XXX:~$ nodetool -h localhost removetoken status
RemovalStatus: Removing token
(85070591730234615865843651857942052864). Waiting for replication
confirmation from [/1.2.3.193].

(wait wait wait)

ubuntu@XXX:~$ nodetool -h localhost removetoken force
RemovalStatus: Removing token
(85070591730234615865843651857942052864). Waiting for replication
confirmation from [/1.2.3.193].

(fixed)
ubuntu@XXX:~$ nodetool -h localhost ring
1.2.3.47   Up     Normal  17.93 GB        12.48%  0
1.2.3.36   Up     Normal  27.76 GB        25.00%  42535295865117307932921825928971026432
1.2.3.193  Up     Normal  53.73 GB        50.00%  127605887595351923798765477786913079296
1.2.3.252  Up     Normal  43.11 GB        12.52%  148904621249875869977532879268261763219
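
To notice this sooner next time, a check along these lines could be
run against saved "nodetool ring" output. It is a rough sketch only:
the line layout is assumed from the 0.7.4 output pasted in this
message (address, Up/Down, state at the start of each node line), and
parse_ring, suspicious_nodes and the expected set are hypothetical
names of mine.

```python
import re

# Assumed layout of a node line: "<ip>  <Up|Down>  <State>  ...".
NODE_LINE = re.compile(r"^(\d+\.\d+\.\d+\.\d+)\s+(Up|Down)\s+(\w+)")


def parse_ring(output):
    """Return (address, status, state) for every node line in the text.

    Lines that do not start with an IPv4 address (headers, wrapped
    tokens, blank lines) are skipped.
    """
    nodes = []
    for line in output.splitlines():
        match = NODE_LINE.match(line.strip())
        if match:
            nodes.append(match.groups())
    return nodes


def suspicious_nodes(output, expected):
    """Return nodes that are unexpected, not Up, or not in state Normal."""
    return [
        (address, status, state)
        for address, status, state in parse_ring(output)
        if address not in expected or status != "Up" or state != "Normal"
    ]


# Ring listing copied from this message (the one showing the ghost).
sample = """\
1.2.3.47   Up     Normal  17.93 GB        12.48%  0
1.2.3.36   Up     Normal  27.76 GB        25.00%  42535295865117307932921825928971026432
2.3.4.193  Down   Normal  12.35 GB        25.00%  85070591730234615865843651857942052864
1.2.3.193  Up     Normal  42.24 GB        25.00%  127605887595351923798765477786913079296
1.2.3.252  Up     Normal  36.66 GB        12.52%  148904621249875869977532879268261763219
"""
expected = {"1.2.3.47", "1.2.3.36", "1.2.3.193", "1.2.3.252"}

for node in suspicious_nodes(sample, expected):
    print("unexpected ring member:", node)
```

This only reports what one node believes the ring looks like; it says
nothing about why the endpoint came back.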

--
Alexis Lê-Quôc
