From: Ran Tavory
Date: Tue, 18 May 2010 14:55:43 +0300
Subject: ConcurrentModificationException in gossiper while decommissioning another node
To: user@cassandra.apache.org

While node 192.168.252.61 was being decommissioned, I saw this error on two other nodes:

 INFO [Timer-1] 2010-05-18 06:01:12,048 Gossiper.java (line 179) InetAddress /192.168.252.62 is now dead.
 INFO [GMFD:1] 2010-05-18 06:04:00,189 Gossiper.java (line 568) InetAddress /192.168.252.62 is now UP
 INFO [Timer-1] 2010-05-18 06:11:45,311 Gossiper.java (line 401) FatClient /192.168.252.61 has been silent for 3600000ms, removing from gossip
ERROR [Timer-1] 2010-05-18 06:11:45,315 CassandraDaemon.java (line 88) Fatal exception in thread Thread[Timer-1,5,main]
java.lang.RuntimeException: java.util.ConcurrentModificationException
        at org.apache.cassandra.gms.Gossiper$GossipTimerTask.run(Gossiper.java:97)
        at java.util.TimerThread.mainLoop(Timer.java:512)
        at java.util.TimerThread.run(Timer.java:462)
Caused by: java.util.ConcurrentModificationException
        at java.util.Hashtable$Enumerator.next(Hashtable.java:1031)
        at org.apache.cassandra.gms.Gossiper.doStatusCheck(Gossiper.java:382)
        at org.apache.cassandra.gms.Gossiper$GossipTimerTask.run(Gossiper.java:91)
        ... 2 more

.61 is the decommissioned node. .62 was under load (streams were being transferred to it from .61).

I simply ran nodetool decommission on the .61 node and then (after an hour, I guess) I saw this error on two other live nodes.

Does this ring a bell? It's either a bug, or I wasn't running decommission correctly...
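
For reference, the "Caused by" frame (Hashtable$Enumerator.next) is the fail-fast check a Hashtable iterator performs when the table is structurally modified during iteration. The sketch below is not taken from the Cassandra source and may not match what Gossiper.doStatusCheck actually does. It only illustrates the general failure mode, and one common way to avoid it, using a hypothetical endpoint table (the class name, method names, and sample addresses are all made up for the example).

```java
import java.util.ConcurrentModificationException;
import java.util.Hashtable;
import java.util.Iterator;
import java.util.Map;

public class CmeSketch {
    // Hypothetical stand-in for a gossip state table: endpoint -> last-heard time (ms).
    static Map<String, Long> sampleState() {
        Map<String, Long> state = new Hashtable<>();
        state.put("192.168.252.60", 0L);
        state.put("192.168.252.61", 0L);
        state.put("192.168.252.62", 0L);
        return state;
    }

    // Removes entries through the map itself while iterating its key set.
    // Returns true if the iterator detected the structural change and threw.
    static boolean removeThroughMap(Map<String, Long> state) {
        try {
            for (String endpoint : state.keySet()) {
                state.remove(endpoint); // bumps the table's modification count
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes entries through the iterator, which keeps its expected
    // modification count in sync. Returns the number of entries removed.
    static int removeThroughIterator(Map<String, Long> state) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = state.entrySet().iterator();
        while (it.hasNext()) {
            it.next();
            it.remove();
            removed++;
        }
        return removed;
    }

    public static void main(String[] args) {
        System.out.println("remove via map threw CME: " + removeThroughMap(sampleState()));
        System.out.println("removed via iterator: " + removeThroughIterator(sampleState()));
    }
}
```

Note that the same exception can also be raised when a different thread modifies the table during iteration, so the trace alone does not say which of the two happened here.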