One of our nodes, which happens to be the seed, thinks it is Up and that all the other nodes are Down.
However, all the other nodes think the seed is Down instead. The logs for the seed node show
everything running as it should. I've tried restarting the node and turning gossip and thrift
off and on, but nothing gets the node to see the rest of its ring as up and running.
I have also tried restarting one of the other nodes, which had no effect on the situation.
Below are the ring outputs for the seed and one other node in the ring, plus a ping to show
that the seed can reach the other node.

# bin/nodetool -h 0.0.0.0 ring
Address         Status State   Load        Owns    Token
                                                   141784319550391026443072753096570088105
127.0.0.1       Up     Normal  4.61 GB     16.67%  0
xx.xxx.30.210   Down   Normal  ?           16.67%  28356863910078205288614550619314017621
xx.xx.90.87     Down   Normal  ?           16.67%  56713727820156410577229101238628035242
xx.xx.22.236    Down   Normal  ?           16.67%  85070591730234615865843651857942052863
xx.xx.97.96     Down   Normal  ?           16.67%  113427455640312821154458202477256070484
xx.xxx.17.122   Down   Normal  ?           16.67%  141784319550391026443072753096570088105

# ping xx.xxx.30.210
PING xx.xxx.30.210 (xx.xxx.30.210) 56(84) bytes of data.
64 bytes from xx.xxx.30.210: icmp_req=1 ttl=61 time=0.299 ms
64 bytes from xx.xxx.30.210: icmp_req=2 ttl=61 time=0.287 ms
^C
--- xx.xxx.30.210 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.287/0.293/0.299/0.006 ms

# bin/nodetool -h xx.xxx.30.210 ring
Address         Status State   Load        Owns    Token
                                                   141784319550391026443072753096570088105
xx.xxx.23.40    Down   Normal  ?           16.67%  0
xx.xxx.30.210   Up     Normal  10.58 GB    16.67%  28356863910078205288614550619314017621
xx.xx.90.87     Up     Normal  10.47 GB    16.67%  56713727820156410577229101238628035242
xx.xx.22.236    Up     Normal  9.63 GB     16.67%  85070591730234615865843651857942052863
xx.xx.97.96     Up     Normal  10.68 GB    16.67%  113427455640312821154458202477256070484
xx.xxx.17.122   Up     Normal  10.18 GB    16.67%  141784319550391026443072753096570088105
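
In case it helps anyone compare the two views side by side, here is a rough Python sketch that diffs two `nodetool ring` outputs by token. The helper names (`parse_ring`, `diff_views`) are made up for illustration, the sample text is abbreviated from the outputs above, and the parsing assumes the column layout shown in this post rather than any guaranteed nodetool format.

```python
# Abbreviated copies of the two ring outputs from this post.
SEED_VIEW = """\
Address Status State Load Owns Token
141784319550391026443072753096570088105
127.0.0.1 Up Normal 4.61 GB 16.67% 0
xx.xxx.30.210 Down Normal ? 16.67% 28356863910078205288614550619314017621
"""

OTHER_VIEW = """\
Address Status State Load Owns Token
141784319550391026443072753096570088105
xx.xxx.23.40 Down Normal ? 16.67% 0
xx.xxx.30.210 Up Normal 10.58 GB 16.67% 28356863910078205288614550619314017621
"""


def parse_ring(text):
    """Map token -> (address, status), assuming the layout shown in this post."""
    view = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows start with an address and end with the token; skip the
        # header row and the lone wrap-around token line.
        if len(parts) < 5 or parts[0] == "Address":
            continue
        view[parts[-1]] = (parts[0], parts[1])
    return view


def diff_views(a, b):
    """Return the tokens on which two views disagree about address or status."""
    tokens = sorted(set(a) | set(b), key=int)
    return {t: (a.get(t), b.get(t)) for t in tokens if a.get(t) != b.get(t)}


if __name__ == "__main__":
    diff = diff_views(parse_ring(SEED_VIEW), parse_ring(OTHER_VIEW))
    for token, (seed_says, other_says) in diff.items():
        print(token, "seed view:", seed_says, "other view:", other_says)
```

Run against the full outputs, this lists every token where the seed and the other node disagree, including token 0, which the seed reports as 127.0.0.1 and the other node reports as xx.xxx.23.40.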
--
Ray Slakinski