incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: UnreachableNodes
Date Fri, 19 Oct 2012 01:57:01 GMT
Cool. 

If you get it again, grab the nodetool gossipinfo output from a few machines.
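
A minimal sketch of collecting that output from several nodes in one go, so the views can be compared side by side. It assumes nodetool is on the PATH of the machine running the script and that JMX is reachable on each host; the helper names gossipinfo_command and collect_gossipinfo are made up for illustration.

```python
import subprocess


def gossipinfo_command(host):
    """Build the nodetool call for one node (assumes nodetool is on PATH)."""
    return ["nodetool", "-h", host, "gossipinfo"]


def collect_gossipinfo(hosts, run=subprocess.run):
    """Return {host: output text}; failures are recorded as 'ERROR: ...' text."""
    results = {}
    for host in hosts:
        try:
            proc = run(gossipinfo_command(host),
                       capture_output=True, text=True, timeout=30)
            if proc.returncode == 0:
                results[host] = proc.stdout
            else:
                results[host] = "ERROR: " + proc.stderr
        except (OSError, subprocess.TimeoutExpired) as exc:
            # nodetool missing, host unreachable, or the call hung
            results[host] = "ERROR: " + str(exc)
    return results
```

Calling collect_gossipinfo with the address of the node that reports the peer as down, plus one or two nodes that see it as up, gives one text blob per node that can be saved or diffed.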

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 19/10/2012, at 3:32 AM, Rene Kochen <Rene.Kochen@emea.schange.com> wrote:

> Thanks Aaron,
> 
> Telnet works (in both directions).
> 
> After a normal (i.e. without discarding ring state) restart of the node reporting the other one as down, the ring shows "up" again. So a node restart fixes the incorrect state.
> 
> I see this error occasionally.
> 
> I will further investigate and post more details when it happens again.
> 
> 2012/10/18 aaron morton <aaron@thelastpickle.com>
> You can double check that the node reporting 9.109 as down can telnet to port 7000 on 9.109.
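
Where telnet is not installed, the same reachability test can be sketched in a few lines of Python. This only checks that a TCP connection to the storage port opens; it says nothing about gossip state. The function name can_connect is hypothetical, and port 7000 is taken from the message above.

```python
import socket


def can_connect(host, port=7000, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts
        return False
```

Run it from the node that reports the peer as down, for example can_connect("10.49.9.109"), and then in the other direction.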

> 
> Then I would restart 9.109 with -Dcassandra.load_ring_state=false added as a JVM param in cassandra-env.sh.
> 
> If it still shows as down, can you post the output from nodetool gossipinfo from 9.109 and the node that sees 9.109 as down?
> 
> Cheers
> 
> 
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 18/10/2012, at 8:45 PM, Rene Kochen <rene.kochen@schange.com> wrote:
> 
>> I have a four node EC2 cluster.
>> 
>> Three machines show via nodetool ring that all machines are UP.
>> One machine shows via nodetool ring that one machine is DOWN.
>> 
>> If I take a closer look at the machine reporting the other machine as down, I see the following:
>> 
>> - StorageService.UnreachableNodes = 10.49.9.109
>> - FailureDetector.SimpleStates: 10.49.9.109 = UP
>> 
>> So gossip is fine. Actually the whole 10.49.9.109 machine is fine. I see in the logging that there is communication between 10.49.9.109 and the machine reporting it as down.
>> 
>> How or when is a node removed from the UnreachableNodes list and reported as UP again via nodetool ring?
>> 
>> I use Cassandra 1.0.11
>> 
>> Thanks!
>> 
>> Rene
>> 
> 
> 

