incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: AntiEntropyService.getNeighbors pulls information from where?
Date Tue, 13 Sep 2011 00:09:40 GMT
I'm pretty sure I'm behind on how to deal with this problem. 

The best I know is to start the node with "-Dcassandra.load_ring_state=false" as a JVM option.
However, that will not help if the ghost IP address is still in gossip, and I would expect it to be.

Does the ghost IP show up in nodetool ring?
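
If it helps to cross-check, here is a rough Python sketch that pulls the dead-neighbor addresses out of system.log and reports any that are not in the list of nodes you expect. The regular expression is an assumption based only on the three log lines quoted below, and ghost_neighbors and the sample data are made-up names for illustration, not anything from Cassandra itself.

```python
import re

# Assumed pattern, derived only from the repair failure lines quoted in this thread.
DEAD_NEIGHBOR = re.compile(
    r"because a neighbor \(/(\d{1,3}(?:\.\d{1,3}){3})\)\s+is dead"
)


def ghost_neighbors(log_text, known_nodes):
    """Return dead-neighbor IPs found in the log that are not known ring members."""
    # Collapse whitespace so the match also works if the log text was line-wrapped.
    flat = " ".join(log_text.split())
    seen = set(DEAD_NEIGHBOR.findall(flat))
    return sorted(seen - set(known_nodes))


# Hypothetical usage with one of the quoted log entries and the four known nodes.
sample = (
    "INFO [AntiEntropySessions:5] 2011-09-10 21:20:02,182 AntiEntropyService.java "
    "(line 658) Could not proceed on repair because a neighbor (/10.130.185.136) "
    "is dead: manual-repair-d8cdb59a-04a4-4596-b73f-cba3bd2b9eab failed."
)
ring = ["172.16.12.10", "172.16.12.11", "172.16.14.12", "172.16.14.10"]
print(ghost_neighbors(sample, ring))  # prints ['10.130.185.136']
```

Any address it prints is one that repair treats as a neighbor but that is not among the nodes you listed, which is the ghost we are after.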

Does anyone know a way to remove a ghost IP from gossip that does not have a token associated
with it?

Cheers
  
-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 13/09/2011, at 6:39 AM, Sasha Dolgy wrote:

> This relates to the issue I opened the other day:
> https://issues.apache.org/jira/browse/CASSANDRA-3175 ..  basically,
> 'nodetool ring' throws an exception on two of the four nodes.
> 
> In my fancy little world, the problems appear to be related to one of
> the nodes thinking that someone is their neighbor ... and that someone
> moved away a long time ago............
> 
> /mnt/cassandra/logs/system.log: INFO [AntiEntropySessions:5]
> 2011-09-10 21:20:02,182 AntiEntropyService.java (line 658) Could not
> proceed on repair because a neighbor (/10.130.185.136) is dead:
> manual-repair-d8cdb59a-04a4-4596-b73f-cba3bd2b9eab failed.
> /mnt/cassandra/logs/system.log: INFO [AntiEntropySessions:7]
> 2011-09-11 21:20:02,258 AntiEntropyService.java (line 658) Could not
> proceed on repair because a neighbor (/10.130.185.136) is dead:
> manual-repair-ad17e938-f474-469c-9180-d88a9007b6b9 failed.
> /mnt/cassandra/logs/system.log: INFO [AntiEntropySessions:9]
> 2011-09-12 21:20:02,256 AntiEntropyService.java (line 658) Could not
> proceed on repair because a neighbor (/10.130.185.136) is dead:
> manual-repair-636150a5-4f0e-45b7-b400-24d8471a1c88 failed.
> 
> This appears only in the logs of the one node that is generating the issue: 172.16.12.10
> 
> Where do I find where AntiEntropyService.getNeighbors(tablename,
> range) is pulling its information from?
> 
> On the two nodes that work:
> 
> [default@system] describe cluster;
> Cluster Information:
> Snitch: org.apache.cassandra.locator.Ec2Snitch
> Partitioner: org.apache.cassandra.dht.RandomPartitioner
> Schema versions:
> 1b871300-dbdc-11e0-0000-564008fe649f: [172.16.12.10, 172.16.12.11,
> 172.16.14.12, 172.16.14.10]
> [default@system]
> 
> From the two nodes that don't work:
> 
> [default@unknown] describe cluster;
> Cluster Information:
> Snitch: org.apache.cassandra.locator.Ec2Snitch
> Partitioner: org.apache.cassandra.dht.RandomPartitioner
> Schema versions:
> 1b871300-dbdc-11e0-0000-564008fe649f: [172.16.12.10, 172.16.12.11,
> 172.16.14.12, 172.16.14.10]
> UNREACHABLE: [10.130.185.136] --> which is really 172.16.14.10
> [default@unknown]
> 
> Really now.  Where does 10.130.185.136 exist?  It's in none of the
> configurations I have AND the full ring has been shut down and started
> up ... not trying to give Vijay a hard time by posting here btw!
> 
> Just thinking it could be something super silly ... that a wider
> audience has come across.
> 
> -- 
> Sasha Dolgy
> sasha.dolgy@gmail.com

