incubator-cassandra-user mailing list archives

From Jonathan Colby <jonathan.co...@gmail.com>
Subject Re: nodetool move trying to stream data to node no longer in cluster
Date Thu, 26 May 2011 07:58:09 GMT
@Aaron -

Unfortunately I'm still seeing messages like:  "<ip-of-removed-node> is down", removing
from gossip, although not with the same frequency.

And repair/move jobs don't seem to try to stream data to the removed node anymore.

Does anyone know how to totally purge any stored gossip/endpoint data for nodes that were
removed from the cluster?  Or what else might be happening here?
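
One way to double-check that nothing is streaming to a removed node is to compare the "Streaming to:" targets in the netstats output against the addresses listed by the ring command. A rough Python sketch, assuming the text formats shown in the netstats and ring listings quoted further down; the function names and the sample strings are made up for illustration and are not part of nodetool:

```python
import re

IP = r"\d+\.\d+\.\d+\.\d+"

def stream_targets(netstats_text):
    """IPs that appear on 'Streaming to: /<ip>' lines of netstats output."""
    return set(re.findall(r"Streaming to:\s*/(" + IP + r")", netstats_text))

def ring_members(ring_text):
    """IPs that start a row of the ring listing (an address followed by a status word)."""
    return set(re.findall(r"^\s*(" + IP + r")\s+\S+", ring_text, flags=re.MULTILINE))

def stale_targets(netstats_text, ring_text):
    """Stream targets that are not members of the ring."""
    return stream_targets(netstats_text) - ring_members(ring_text)

# Sample text copied from the listings in this thread (shortened).
NETSTATS = """Mode: Leaving: streaming data to other nodes
Streaming to: /10.46.108.102
"""
RING = """Address         Status State   Load            Owns    Token
10.46.108.100   Up     Normal  71.73 GB        12.50%  0
10.47.108.102   Up     Normal  210.77 GB       0.00%   85070591730234615865843651857942052864
"""

if __name__ == "__main__":
    print(sorted(stale_targets(NETSTATS, RING)))
```

With the sample text above, the only address reported is 10.46.108.102, the node that was decommissioned. This only inspects text that has already been captured; it does not talk to the cluster.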


On May 26, 2011, at 9:10 AM, aaron morton wrote:

> Cool. I was going to suggest that, but as you already had the move running I thought it
> might be a little drastic.
> 
> Did it show any progress? If the IP address is not responding there should have been
> some sort of error.
> 
> Cheers
> 
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 26 May 2011, at 15:28, jonathan.colby@gmail.com wrote:
> 
>> Seems like it had something to do with stale endpoint information. I did a rolling
>> restart of the whole cluster and that seemed to trigger the nodes to remove the node that
>> was decommissioned.
>> 
>> On , aaron morton <aaron@thelastpickle.com> wrote:
>>> Is it showing progress? It may just be a problem with the information printed out.
>>> 
>>> 
>>> 
>>> Can you check from the other nodes in the cluster to see if they are receiving the stream?
>>> 
>>> 
>>> 
>>> cheers
>>> 
>>> 
>>> 
>>> -----------------
>>> 
>>> Aaron Morton
>>> 
>>> Freelance Cassandra Developer
>>> 
>>> @aaronmorton
>>> 
>>> http://www.thelastpickle.com
>>> 
>>> 
>>> 
>>> On 26 May 2011, at 00:42, Jonathan Colby wrote:
>>> 
>>> 
>>> 
>>>> I recently removed a node (with decommission) from our cluster.
>>> 
>>>> 
>>> 
>>>> I added a couple of new nodes and am now trying to rebalance the cluster using nodetool move.
>>> 
>>>> 
>>> 
>>>> However, netstats shows that the node being "moved" is trying to stream data to the node that I already decommissioned yesterday.
>>> 
>>>> 
>>> 
>>>> The removed node was powered off and taken out of DNS, and its IP is not even pingable. It was never a seed either.
>>> 
>>>> 
>>> 
>>>> This is Cassandra 0.7.5 on 64-bit Linux. How do I tell the cluster that this node is gone? Gossip should have detected this. The ring command shows the correct cluster IPs.
>>> 
>>>> 
>>> 
>>>> Here is a portion of netstats. 10.46.108.102 is the node which was removed.
>>> 
>>>> 
>>> 
>>>> Mode: Leaving: streaming data to other nodes
>>> 
>>>> Streaming to: /10.46.108.102
>>> 
>>>>  /var/lib/cassandra/data/DFS/main-f-1064-Data.db/(4681027,5195491),(5195491,15308570),(15308570,15891710),(16336750,20558705),(20558705,29112203),(29112203,36279329),(36465942,36623223),(36740457,37227058),(37227058,42206994),(42206994,47380294),(47635053,47709813),(47709813,48353944),(48621287,49406499),(53330048,53571312),(53571312,54153922),(54153922,59857615),(59857615,61029910),(61029910,61871509),(62190800,62498605),(62824281,62964830),(63511604,64353114),(64353114,64760400),(65174702,65919771),(65919771,66435630),(81440029,81725949),(81725949,83313847),(83313847,83908709),(88983863,89237303),(89237303,89934199),(89934199,97
>>> 
>>>> ...................
>>> 
>>>> 5693491,14795861666),(14795861666,14796105318),(14796105318,14796366886),(14796699825,14803874941),(14803874941,14808898331),(14808898331,14811670699),(14811670699,14815125177),(14815125177,14819765003),(14820229433,14820858266)
>>> 
>>>>        progress=280574376402/12434049900 - 2256%
>>> 
>>>> .....
>>> 
>>>> 
>>> 
>>>> 
>>> 
>>>> Note 10.46.108.102 is NOT part of the ring.
>>> 
>>>> 
>>> 
>>>> Address         Status State   Load            Owns    Token
>>>>                                                      148873535527910577765226390751398592512
>>>> 10.46.108.100   Up     Normal  71.73 GB        12.50%  0
>>>> 10.46.108.101   Up     Normal  109.69 GB       12.50%  21267647932558653966460912964485513216
>>>> 10.47.108.100   Up     Leaving 281.95 GB       37.50%  85070591730234615865843651857942052863
>>>> 10.47.108.102   Up     Normal  210.77 GB       0.00%   85070591730234615865843651857942052864
>>>> 10.47.108.101   Up     Normal  289.59 GB       16.67%  113427455640312821154458202477256070484
>>>> 10.46.108.103   Up     Normal  299.87 GB       8.33%   127605887595351923798765477786913079296
>>>> 10.47.108.103   Up     Normal  94.99 GB        12.50%  148873535527910577765226390751398592511
>>>> 10.46.108.104   Up     Normal  103.01 GB       0.00%   148873535527910577765226390751398592512
>>> 
>>>> 
>>> 
>>>> 
>>> 
>>>> 
>>> 
>>> 
>>> 
> 
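
For reference, the Owns column in the quoted ring output appears to follow directly from the spacing between tokens: each node owns the range from the previous token (wrapping around the ring) up to its own. A minimal Python sketch, assuming RandomPartitioner with a token space of 2**127 (the thread does not state the partitioner, so that is an assumption) and a hypothetical helper name:

```python
RING_SIZE = 2 ** 127  # assumed RandomPartitioner token space

def ownership(tokens):
    """Map each token to the fraction of the ring between the previous token and it."""
    ordered = sorted(tokens)
    if len(ordered) == 1:
        return {ordered[0]: 1.0}  # a single node owns the whole ring
    owns = {}
    for i, tok in enumerate(ordered):
        prev = ordered[i - 1]  # for i == 0 this wraps around to the highest token
        owns[tok] = ((tok - prev) % RING_SIZE) / RING_SIZE
    return owns

# Tokens copied from the ring listing quoted above.
TOKENS = [
    0,
    21267647932558653966460912964485513216,
    85070591730234615865843651857942052863,
    85070591730234615865843651857942052864,
    113427455640312821154458202477256070484,
    127605887595351923798765477786913079296,
    148873535527910577765226390751398592511,
    148873535527910577765226390751398592512,
]

if __name__ == "__main__":
    for tok, frac in ownership(TOKENS).items():
        print(f"{frac * 100:6.2f}%  {tok}")
```

The two nodes shown with 0.00% sit one token after their neighbours, which is why they own almost nothing until the move completes.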

