cassandra-commits mailing list archives

From "Dan Retzlaff (JIRA)" <j...@apache.org>
Subject [jira] Created: (CASSANDRA-1494) Gossiper ConcurrentModificationException after Decommissioning
Date Fri, 10 Sep 2010 16:49:33 GMT
Gossiper ConcurrentModificationException after Decommissioning
--------------------------------------------------------------

                 Key: CASSANDRA-1494
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1494
             Project: Cassandra
          Issue Type: Bug
          Components: Core
    Affects Versions: 0.6.5
         Environment: Linux 2.6.33.8-149.fc13.x86_64 #1 SMP Tue Aug 17 22:53:15 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux
            Reporter: Dan Retzlaff
            Priority: Critical


After 192.168.2.147 was decommissioned, the Gossiper on 192.168.2.55 threw a
ConcurrentModificationException. This cascaded into 192.168.2.55 repeatedly marking
192.168.2.148 and 192.168.2.149 as UP and then DOWN. Eventually this left so many inter-node
(storage port) TCP connections in CLOSE_WAIT that other nodes started failing with
"too many open files" exceptions.

 INFO [Timer-0] 2010-09-08 17:00:02,398 Gossiper.java (line 402) FatClient /192.168.2.147 has been silent for 3600000ms, removing from gossip
ERROR [Timer-0] 2010-09-08 17:00:02,418 Gossiper.java (line 99) Gossip error
java.util.ConcurrentModificationException
        at java.util.Hashtable$Enumerator.next(Hashtable.java:1031)
        at org.apache.cassandra.gms.Gossiper.doStatusCheck(Gossiper.java:383)
        at org.apache.cassandra.gms.Gossiper$GossipTimerTask.run(Gossiper.java:93)
        at java.util.TimerThread.mainLoop(Timer.java:512)
        at java.util.TimerThread.run(Timer.java:462)
 INFO [Timer-0] 2010-09-08 17:00:12,398 Gossiper.java (line 180) InetAddress /192.168.2.148 is now dead.
 INFO [Timer-0] 2010-09-08 17:00:14,399 Gossiper.java (line 180) InetAddress /192.168.2.149 is now dead.
 INFO [GMFD:1] 2010-09-08 17:00:19,400 Gossiper.java (line 578) InetAddress /192.168.2.149 is now UP
 INFO [HINTED-HANDOFF-POOL:1] 2010-09-08 17:00:19,400 HintedHandOffManager.java (line 165) Started hinted handoff for endPoint /192.168.2.149
 INFO [HINTED-HANDOFF-POOL:1] 2010-09-08 17:00:19,401 HintedHandOffManager.java (line 222) Finished hinted handoff of 0 rows to endpoint /192.168.2.149
 INFO [Timer-0] 2010-09-08 17:00:20,399 Gossiper.java (line 180) InetAddress /192.168.2.149 is now dead.
 INFO [GMFD:1] 2010-09-08 17:00:43,409 Gossiper.java (line 578) InetAddress /192.168.2.148 is now UP
 INFO [HINTED-HANDOFF-POOL:1] 2010-09-08 17:00:43,409 HintedHandOffManager.java (line 165) Started hinted handoff for endPoint /192.168.2.148
 INFO [HINTED-HANDOFF-POOL:1] 2010-09-08 17:00:43,410 HintedHandOffManager.java (line 222) Finished hinted handoff of 0 rows to endpoint /192.168.2.148
 INFO [Timer-0] 2010-09-08 17:00:44,404 Gossiper.java (line 180) InetAddress /192.168.2.148 is now dead.
 INFO [GMFD:1] 2010-09-08 17:01:18,415 Gossiper.java (line 578) InetAddress /192.168.2.149 is now UP

(UP/DOWN cycle repeats until the target node *really* goes DOWN due to too many TCP sockets
in CLOSE_WAIT.)
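
For reference, the stack trace above ends in Hashtable$Enumerator.next, which is the
fail-fast check that fires when a Hashtable is structurally modified while one of its
collection views is being iterated. The sketch below reproduces that class of failure in
isolation and shows one common way to avoid it. It is not the Cassandra code: the class
name, the method names, and the address-to-timestamp map are all hypothetical, and it
makes no claim about what Gossiper.doStatusCheck actually does at line 383.

```java
import java.util.ConcurrentModificationException;
import java.util.Hashtable;
import java.util.Iterator;
import java.util.Map;

// Hypothetical stand-in for a gossip endpoint table; not taken from Cassandra.
class GossipIterationSketch {

    // Assumed shape: endpoint address -> time the endpoint was last heard from (ms).
    static Map<String, Long> sampleEndpoints() {
        Map<String, Long> endpoints = new Hashtable<>();
        endpoints.put("192.168.2.147", 0L);
        endpoints.put("192.168.2.148", 1000L);
        endpoints.put("192.168.2.149", 1000L);
        return endpoints;
    }

    // Unsafe: removes through the map itself while iterating its key set.
    // Returns true if the iterator detected the modification and threw.
    static boolean unsafeSweepThrows(Map<String, Long> endpoints, long cutoff) {
        try {
            for (String address : endpoints.keySet()) {
                if (endpoints.get(address) < cutoff) {
                    // Structural change behind the iterator's back; the next call
                    // to next() fails if any entries are still left to visit.
                    endpoints.remove(address);
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Safe: removes through the iterator, which keeps its own bookkeeping in sync.
    // Returns the number of entries removed.
    static int safeSweep(Map<String, Long> endpoints, long cutoff) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = endpoints.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() < cutoff) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }
}
```

Calling unsafeSweepThrows with a cutoff that marks more than one entry as stale hits the
exception, while safeSweep removes the same entries without error. Iterator.remove only
helps when the removal happens on the iterating thread; if another thread modifies the map
during iteration, a concurrent map or iterating over a copy would be needed instead.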

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

