A restart of node1 fixed the problem.
The only thing I saw in the log of node1 before the problem was the following:
InetAddress /172.27.70.135 is now dead.
InetAddress /172.27.70.135 is now UP
After this, the nodetool ring command showed node 172.27.70.135 as dead.
You mention a “stored ring view”. Can it be that this stored ring view was out of sync with the actual (gossip) situation?
Without knowing much more, I would try the following…
* Restart each node in turn, watching the logs to see what each node says about the other.
* If the restart does not fix it, try starting the node with the -Dcassandra.load_ring_state=false JVM option. That tells the node to ignore its stored ring view and use what gossip is telling it. Add it as a new line at the bottom of cassandra-env.sh.
If it's still failing, watch the logs and see what each node says when it marks the other as being down.
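For reference, the load_ring_state option mentioned above can be wired in like this — a sketch assuming the stock cassandra-env.sh layout (the path to the file varies by install, e.g. conf/ or /etc/cassandra/):

```shell
# Sketch: appended as a new line at the bottom of cassandra-env.sh.
# Takes effect the next time the node is started.
JVM_OPTS="$JVM_OPTS -Dcassandra.load_ring_state=false"
```

Remember to remove the line again once the node has rejoined cleanly, otherwise it will discard its saved ring view on every restart.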
On 1/02/2012, at 11:12 PM, Rene Kochen wrote:
I have a cluster with seven nodes.
If I run the nodetool ring command on all nodes, I see the following:
Node 1 says that node 2 is down.
Node 2 says that node 1 is down.
All other nodes say that everyone is up.
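A quick way to compare the ring view reported by every node is to loop over the hosts and ask each for its own view with `nodetool -h <host> ring`. This is a sketch with hypothetical addresses — substitute your seven nodes:

```shell
# Hypothetical host list -- replace with the addresses of your seven nodes.
HOSTS="172.27.70.135 172.27.70.136"

for host in $HOSTS; do
  echo "== ring as seen by $host =="
  # Guarded so the sketch degrades gracefully where nodetool is not on PATH.
  command -v nodetool >/dev/null && nodetool -h "$host" ring \
    || echo "(nodetool not available here)"
done
```

Diffing the per-node output makes it obvious which nodes disagree about the cluster state.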
Is this normal behavior?
I see no network related problems. Also no problems between node1 and node2.
I am using Cassandra 0.7.10.