Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (athena.apache.org: local policy)
From: aaron morton <aaron@thelastpickle.com>
Mime-Version: 1.0 (Apple Message framework v1244.3)
Content-Type: multipart/alternative;
 boundary="Apple-Mail=_B53BA189-4CA0-4543-B5B2-6B6E59797830"
Subject: Re: Completely removing a node from the cluster
Date: Tue, 23 Aug 2011 20:45:23 +1200
In-Reply-To: <076926A9-B9E1-4CC3-B858-C116C97BDE09@gmail.com>
To: user@cassandra.apache.org
References: 
 <376CEC01195C894CB9F8A3C274029A96AF25338F@FISH-EX2K10-01.azaleos.net>
 <593A1215-C630-4D6B-B905-4779389A782B@thelastpickle.com>
 <376CEC01195C894CB9F8A3C274029A96AF256B8B@FISH-EX2K10-01.azaleos.net>
 <504F4C34-7C5C-43D5-8821-18758D389F16@thelastpickle.com>
 <376CEC01195C894CB9F8A3C274029A96AF256DAD@FISH-EX2K10-01.azaleos.net>
 <376CEC01195C894CB9F8A3C274029A96AF258687@FISH-EX2K10-01.azaleos.net>
 <81FAAD69-6DA8-41A9-86E0-F5B66D55FD34@thelastpickle.com>
 <076926A9-B9E1-4CC3-B858-C116C97BDE09@gmail.com>
Message-Id: <31E4C10E-C4CD-4CE1-A1D4-61FD04FDD4CF@thelastpickle.com>


--Apple-Mail=_B53BA189-4CA0-4543-B5B2-6B6E59797830
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
	charset=us-ascii

I normally link to the DataStax article to avoid having to actually =
write those words :)

=
http://www.datastax.com/docs/0.8/troubleshooting/index#view-of-ring-differ=
s-between-some-nodes
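
For anyone who wants to check for the phantom from a script, here is a
minimal sketch (not taken from the article). It compares the UNREACHABLE
list from the cassandra-cli describe cluster output with the addresses
shown by nodetool ring, using the output quoted further down in this
thread. The helper names and the regular expressions are my own
assumptions about those text formats, not anything Cassandra ships.

```python
import re

# Hypothetical helpers: the names and parsing rules are assumptions based
# only on the command output quoted in this thread.
SCHEMA_LINE = re.compile(r"^\s*([0-9a-fA-F-]{36}|UNREACHABLE):\s*\[(.*)\]\s*$")
RING_LINE = re.compile(r"^(\d+\.\d+\.\d+\.\d+)\s")


def parse_schema_versions(text):
    """Map each schema version (or 'UNREACHABLE') to its list of endpoints."""
    versions = {}
    for line in text.splitlines():
        match = SCHEMA_LINE.match(line)
        if match is None:
            continue  # skip the Snitch, Partitioner and heading lines
        key, endpoints = match.groups()
        versions[key] = [e.strip() for e in endpoints.split(",") if e.strip()]
    return versions


def parse_ring_endpoints(text):
    """Return the set of addresses that start a row of the ring listing."""
    endpoints = set()
    for line in text.splitlines():
        match = RING_LINE.match(line)
        if match:
            endpoints.add(match.group(1))
    return endpoints


describe_cluster = """Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions:
   dcc8f680-caa4-11e0-0000-553d4dced3ff: [192.168.20.2, 192.168.20.3]
   UNREACHABLE: [192.168.20.1]
"""

ring = """Address         DC          Rack        Status State   Load      Owns    Token
                                                                    85070591730234615865843651857942052864
192.168.20.2    datacenter1 rack1       Up     Normal  79.53 GB  50.00%  0
192.168.20.3    datacenter1 rack1       Up     Normal  42.63 GB  50.00%  85070591730234615865843651857942052864
"""

versions = parse_schema_versions(describe_cluster)
# Unreachable according to describe cluster, yet absent from the ring.
phantoms = sorted(set(versions.get("UNREACHABLE", [])) - parse_ring_endpoints(ring))
print(phantoms)
```

With the output from this thread it reports 192.168.20.1 as the only
endpoint that is marked unreachable but no longer appears in the ring.
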
A
-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 23/08/2011, at 7:45 PM, Jonathan Colby wrote:

> I ran into this. I also tried load_ring_state=3Dfalse, which did =
not help. The way I got through this was to stop the entire cluster =
and start the nodes one by one.
>=20
> I realize this is not a practical solution for everyone, but if you =
can afford to stop the cluster for a few minutes, it's worth a try.
>=20
>=20
> On Aug 23, 2011, at 9:26 AM, aaron morton wrote:
>=20
>> I'm running low on ideas for this one. Anyone else ?=20
>>=20
>> If the phantom node is not listed in the ring, other nodes should not =
be storing hints for it. You can see what nodes they are storing hints =
for via JConsole.=20
>>=20
>> You can try a rolling restart, passing the JVM option =
-Dcassandra.load_ring_state=3Dfalse. However, if the phantom node is =
being passed around in the gossip state, it will probably just come =
back again.
>>=20
>> Cheers
>>=20
>>=20
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>=20
>> On 23/08/2011, at 3:49 PM, Bryce Godfrey wrote:
>>=20
>>> Could this ghost node be causing my hints column family to grow to =
this size?  I also crash after about 24 hours due to commit log growth =
taking up all the drive space.  A manual nodetool flush keeps it under =
control though.
>>>=20
>>>=20
>>>              Column Family: HintsColumnFamily
>>>              SSTable count: 6
>>>              Space used (live): 666480352
>>>              Space used (total): 666480352
>>>              Number of Keys (estimate): 768
>>>              Memtable Columns Count: 1043
>>>              Memtable Data Size: 461773
>>>              Memtable Switch Count: 3
>>>              Read Count: 38
>>>              Read Latency: 131.289 ms.
>>>              Write Count: 582108
>>>              Write Latency: 0.019 ms.
>>>              Pending Tasks: 0
>>>              Key cache capacity: 7
>>>              Key cache size: 6
>>>              Key cache hit rate: 0.8333333333333334
>>>              Row cache: disabled
>>>              Compacted row minimum size: 2816160
>>>              Compacted row maximum size: 386857368
>>>              Compacted row mean size: 120432714
>>>=20
>>> Is there a way for me to manually remove this dead node?
>>>=20
>>> -----Original Message-----
>>> From: Bryce Godfrey [mailto:Bryce.Godfrey@azaleos.com]=20
>>> Sent: Sunday, August 21, 2011 9:09 PM
>>> To: user@cassandra.apache.org
>>> Subject: RE: Completely removing a node from the cluster
>>>=20
>>> It's been at least 4 days now.
>>>=20
>>> -----Original Message-----
>>> From: aaron morton [mailto:aaron@thelastpickle.com]=20
>>> Sent: Sunday, August 21, 2011 3:16 PM
>>> To: user@cassandra.apache.org
>>> Subject: Re: Completely removing a node from the cluster
>>>=20
>>> I see the mistake I made about ring: it gets the endpoint list from =
the same place but uses the tokens to drive the whole process.=20
>>>=20
>>> I'm guessing here, don't have time to check all the code. But there =
is a 3 day timeout in the gossip system. Not sure if it applies in this =
case.=20
>>>=20
>>> Anyone know ?
>>>=20
>>> Cheers
>>>=20
>>> -----------------
>>> Aaron Morton
>>> Freelance Cassandra Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>=20
>>> On 22/08/2011, at 6:23 AM, Bryce Godfrey wrote:
>>>=20
>>>> Both .2 and .3 report the same from the MBean: UnreachableNodes is =
an empty collection, and LiveNodes still lists all 3 nodes:
>>>> 192.168.20.2
>>>> 192.168.20.3
>>>> 192.168.20.1
>>>>=20
>>>> The removetoken was done a few days ago, and I believe the remove =
was done from .2
>>>>=20
>>>> Here is what the ring output looks like; not sure why I get that =
token on the empty first line either:
>>>> Address         DC          Rack        Status State   Load         =
   Owns    Token
>>>>                                                                     =
       85070591730234615865843651857942052864
>>>> 192.168.20.2    datacenter1 rack1       Up     Normal  79.53 GB     =
  50.00%  0
>>>> 192.168.20.3    datacenter1 rack1       Up     Normal  42.63 GB     =
  50.00%  85070591730234615865843651857942052864
>>>>=20
>>>> Yes, both nodes show the same thing when doing a describe cluster, =
that .1 is unreachable.
>>>>=20
>>>>=20
>>>> -----Original Message-----
>>>> From: aaron morton [mailto:aaron@thelastpickle.com]=20
>>>> Sent: Sunday, August 21, 2011 4:23 AM
>>>> To: user@cassandra.apache.org
>>>> Subject: Re: Completely removing a node from the cluster
>>>>=20
>>>> Unreachable nodes either did not respond to the message or were =
known to be down and were not sent a message.=20
>>>> The way the node lists are obtained for the ring command and =
describe cluster is the same. So it's a bit odd.=20
>>>>=20
>>>> Can you connect to JMX and have a look at the =
o.a.c.db.StorageService MBean? What do the LiveNodes and =
UnreachableNodes attributes say?=20
>>>>=20
>>>> Also how long ago did you remove the token and on which machine? Do =
both 20.2 and 20.3 think 20.1 is still around ?=20
>>>>=20
>>>> Cheers
>>>>=20
>>>>=20
>>>> -----------------
>>>> Aaron Morton
>>>> Freelance Cassandra Developer
>>>> @aaronmorton
>>>> http://www.thelastpickle.com
>>>>=20
>>>> On 20/08/2011, at 9:48 AM, Bryce Godfrey wrote:
>>>>=20
>>>>> I'm on 0.8.4
>>>>>=20
>>>>> I have removed a dead node from the cluster using the nodetool =
removetoken command, and moved one of the remaining nodes to rebalance =
the tokens.  Everything looks fine when I run nodetool ring now, as it =
only lists the remaining 2 nodes and they both look fine, owning 50% of =
the tokens.
>>>>>=20
>>>>> However, I can still see it being considered as part of the =
cluster from the Cassandra-cli (192.168.20.1 being the removed node) and =
I'm worried that the cluster is still queuing up hints for the node, or =
any other issues it may cause:
>>>>>=20
>>>>> Cluster Information:
>>>>> Snitch: org.apache.cassandra.locator.SimpleSnitch
>>>>> Partitioner: org.apache.cassandra.dht.RandomPartitioner
>>>>> Schema versions:
>>>>>    dcc8f680-caa4-11e0-0000-553d4dced3ff: [192.168.20.2, =
192.168.20.3]
>>>>>    UNREACHABLE: [192.168.20.1]
>>>>>=20
>>>>>=20
>>>>> Do I need to do something else to completely remove this node?
>>>>>=20
>>>>> Thanks,
>>>>> Bryce
>>>>=20
>>>=20
>>=20
>=20


--Apple-Mail=_B53BA189-4CA0-4543-B5B2-6B6E59797830--