cassandra-commits mailing list archives

From "Radim Kolar (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-3463) cluster split due to schema disagreement
Date Mon, 07 Nov 2011 12:54:51 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-3463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13145368#comment-13145368 ]

Radim Kolar commented on CASSANDRA-3463:
----------------------------------------

The situation has now changed a bit:

 INFO [GossipTasks:1] 2011-11-07 13:32:54,596 Gossiper.java (line 716) InetAddress /****.99.40 is now dead.
 INFO [GossipStage:1] 2011-11-07 13:32:55,163 Gossiper.java (line 702) InetAddress /***.99.40 is now UP
 INFO [HintedHandoff:7] 2011-11-07 13:33:25,046 HintedHandOffManager.java (line 323) Started hinted handoff for endpoint /***.99.40
 INFO [HintedHandoff:7] 2011-11-07 13:33:45,090 HintedHandOffManager.java (line 357) Could not complete hinted handoff to /***.99.40
 INFO [HintedHandoff:7] 2011-11-07 13:33:45,090 HintedHandOffManager.java (line 379) Finished hinted handoff of 0 rows to endpoint /****.99.40

However, one node still thinks that 99.40 is unreachable, even though it is up and no schema
disagreement was reported in the last hinted handoff delivery attempt.
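
To see how often an endpoint flips between UP and dead, the Gossiper lines can be tallied with a small script. This is a minimal Python sketch that assumes the log format shown in this message; the name `count_transitions` and the regular expression are illustrative and not part of Cassandra.

```python
import re
from collections import Counter

# Matches Gossiper state-change lines such as:
#   INFO [GossipTasks:1] ... Gossiper.java (line 716) InetAddress /x.99.40 is now dead.
# \s+ before "is now" tolerates the line break that mail wrapping may insert there.
STATE_CHANGE = re.compile(
    r"Gossiper\.java \(line \d+\) InetAddress (/\S+)\s+is now (dead|UP)"
)


def count_transitions(log_text):
    """Return a Counter keyed by (endpoint, state) for each state change found."""
    counts = Counter()
    for endpoint, state in STATE_CHANGE.findall(log_text):
        counts[(endpoint, state.lower())] += 1
    return counts


# Two lines copied from the quoted issue description, masked address included.
sample = """\
 INFO [GossipTasks:1] 2011-11-06 18:49:56,325 Gossiper.java (line 716) InetAddress /*****99.40
is now dead.
 INFO [GossipStage:1] 2011-11-06 18:50:01,345 Gossiper.java (line 702) InetAddress /*****99.40
is now UP
"""

for (endpoint, state), n in sorted(count_transitions(sample).items()):
    print(endpoint, state, n)
```

For the two sample lines this counts one dead and one UP transition for the masked endpoint. Because the address is masked inconsistently in the logs, counts for the same host may end up under different keys.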
                
> cluster split due to schema disagreement
> ----------------------------------------
>
>                 Key: CASSANDRA-3463
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3463
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.8.7
>            Reporter: Radim Kolar
>
> I found an interesting situation in a 2-node cluster. The replication factor is 1.
> Gossip (nodetool ring) reports on both nodes that both nodes are up.
> Address         DC          Rack        Status State   Load            Owns    Token
>                                                                                99070591730234615865843651857942052864
> ****.104.18     datacenter1 rack1       Up     Normal  19.36 GB        41.77%  0
> ****.99.40    datacenter1 rack1       Up     Normal  26.24 GB        58.23%  
> One node works fine, while the second thinks that the other node is down even though its own gossip correctly recognizes the other node as up. The problem is in schema agreement, but I don't know whether the logs contain enough information to discover why the nodes could not reach schema agreement.
> [default@test] describe cluster;
> Cluster Information:
>    Snitch: org.apache.cassandra.locator.SimpleSnitch
>    Partitioner: org.apache.cassandra.dht.RandomPartitioner
>    Schema versions:
>         9f2b5be0-06e2-11e1-0000-d14dd490cdf6: [****.104.18]
>         UNREACHABLE: [****.99.40]
>  INFO [GossipTasks:1] 2011-11-06 18:49:56,325 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.
>  INFO [GossipStage:1] 2011-11-06 18:50:01,345 Gossiper.java (line 702) InetAddress /*****99.40 is now UP
>  INFO [GossipTasks:1] 2011-11-06 18:50:02,331 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.
>  INFO [GossipStage:1] 2011-11-06 18:50:06,444 Gossiper.java (line 702) InetAddress /*****99.40 is now UP
>  INFO [GossipTasks:1] 2011-11-06 18:50:07,336 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.
>  INFO [GossipStage:1] 2011-11-06 18:50:11,544 Gossiper.java (line 702) InetAddress /*****99.40 is now UP
>  INFO [GossipTasks:1] 2011-11-06 18:50:12,341 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.
>  INFO [GossipStage:1] 2011-11-06 18:50:16,644 Gossiper.java (line 702) InetAddress /*****99.40 is now UP
>  INFO [GossipTasks:1] 2011-11-06 18:50:17,347 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.
>  INFO [GossipStage:1] 2011-11-06 18:50:31,944 Gossiper.java (line 702) InetAddress /*****99.40 is now UP
>  INFO [GossipTasks:1] 2011-11-06 18:50:32,362 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.
>  INFO [GossipStage:1] 2011-11-06 18:50:37,044 Gossiper.java (line 702) InetAddress /*****99.40 is now UP
> ERROR [HintedHandoff:6] 2011-11-06 18:50:42,010 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[HintedHandoff:6,1,main]
> java.lang.RuntimeException: java.lang.RuntimeException: Could not reach schema agreement with /*****99.40 in 60000ms
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>         at java.lang.Thread.run(Thread.java:679)
> Caused by: java.lang.RuntimeException: Could not reach schema agreement with /*****99.40 in 60000ms
>         at org.apache.cassandra.db.HintedHandOffManager.waitForSchemaAgreement(HintedHandOffManager.java:293)
>         at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:304)
>         at org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:89)
>         at org.apache.cassandra.db.HintedHandOffManager$2.runMayThrow(HintedHandOffManager.java:397)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         ... 3 more
> ERROR [HintedHandoff:6] 2011-11-06 18:50:42,028 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[HintedHandoff:6,1,main]
> java.lang.RuntimeException: java.lang.RuntimeException: Could not reach schema agreement with /*****99.40 in 60000ms
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>         at java.lang.Thread.run(Thread.java:679)
> Caused by: java.lang.RuntimeException: Could not reach schema agreement with /*****99.40 in 60000ms
>         at org.apache.cassandra.db.HintedHandOffManager.waitForSchemaAgreement(HintedHandOffManager.java:293)
>         at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:304)
>         at org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:89)
>         at org.apache.cassandra.db.HintedHandOffManager$2.runMayThrow(HintedHandOffManager.java:397)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         ... 3 more
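
The stack traces above show hint delivery giving up after waiting 60000ms for schema agreement. The general shape of such a wait (poll, compare versions, raise on timeout) can be sketched as follows. This is an illustration in Python under stated assumptions, not Cassandra's implementation: `wait_for_schema_agreement`, its parameters, and the two version callables are hypothetical names introduced here.

```python
import time


def wait_for_schema_agreement(local_version, remote_version,
                              timeout_ms=60000, poll_ms=1000,
                              clock=time.monotonic, sleep=time.sleep):
    """Poll until the remote schema version equals the local one.

    local_version and remote_version are hypothetical zero-argument
    callables; remote_version returns None while the peer's version is
    unknown. clock and sleep are injectable so the loop can be tested
    without real waiting. Raises RuntimeError once timeout_ms has elapsed.
    """
    deadline = clock() + timeout_ms / 1000.0
    while True:
        remote = remote_version()
        if remote is not None and remote == local_version():
            return remote
        if clock() >= deadline:
            raise RuntimeError(
                "Could not reach schema agreement in %dms" % timeout_ms
            )
        sleep(poll_ms / 1000.0)
```

A peer whose schema version is never learned keeps this loop spinning until the deadline. That would be consistent with `describe cluster` listing the node as UNREACHABLE while gossip reports it as up, although the logs here do not confirm the cause.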

