From: Jonathan Ellis
Date: Tue, 24 Nov 2009 11:21:30 -0600
Subject: Re: ring state out of sync in build 883477
To: cassandra-user@incubator.apache.org

Looks like this is another symptom of
https://issues.apache.org/jira/browse/CASSANDRA-150, which is on track
to be fixed soon

On Tue, Nov 24, 2009 at 11:19 AM, B. Todd Burruss wrote:
> they all were restarted at various times.
>
> for vmguest85 the other three are seed nodes.
>
>
> On Mon, 2009-11-23 at 19:21 -0600, Jonathan Ellis wrote:
>> So vmquest85 was restarted, but gen-app02 hasn't told it that there
>> are 2 other nodes that are down?
>>
>> Which one is the seed node?
>>
>> On Mon, Nov 23, 2009 at 6:38 PM, B. Todd Burruss wrote:
>> > i'm observing the following on a cluster that started with 4 nodes.  i have
>> > been killing and restarting the various nodes as i test cassandra and now
>> > i'm seeing a lot of NotFoundException exceptions in the client because what
>> > i believe is ring state out of sync between the two nodes that are still up
>> > and available.  The first ring state shown below reflects the current state
>> > of the cluster.  Also I have seen similar issues when one of the nodes
>> > thinks another node is still available when in fact it has been killed.  it
>> > seems to be related to bringing up, killing nodes too fast and not letting
>> > them figure out when a node is "dead".  in this case i see TimedOutException
>> > related to NIO SocketChannel class.
>> >
>> > thx!
>> >
>> > [cassandra.883477]$ bin/nodeprobe -host gen-app02.dev.real.com -port 8080 ring
>> > Address         Status  Load      Range                                    Ring
>> >                                   144038903974614862325597275257769797985
>> > 172.27.128.186  Down    22.17 MB  31124469348629903091013930339840898757   |<--|
>> > 172.27.128.23   Down    22.17 MB  64378740291415296162944450043143967518   |   |
>> > 172.27.128.22   Up      22.17 MB  121134220722269938669001112695509564769  |   |
>> > 172.27.128.185  Up      14.69 MB  144038903974614862325597275257769797985  |-->|
>> >
>> > [cassandra.883477]$ bin/nodeprobe -host vmguest85.prognet.com -port 8080 ring
>> > Address         Status  Load      Range                                    Ring
>> >                                   144038903974614862325597275257769797985
>> > 172.27.128.22   Up      22.17 MB  121134220722269938669001112695509564769  |<--|
>> > 172.27.128.185  Up      14.69 MB  144038903974614862325597275257769797985  |-->|
>> > [cassandra.883477]$
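
The two ring listings quoted in this thread can be compared mechanically to see exactly where the hosts disagree. What follows is only a minimal sketch: the function name diff_ring_views and the dictionary layout are hypothetical and are not part of Cassandra or nodeprobe, and the address/status pairs are transcribed by hand from the quoted output rather than parsed from it.

```python
# Hypothetical helper, not part of Cassandra or nodeprobe: compare the ring
# as seen by several hosts and report every address they disagree on.

def diff_ring_views(views):
    """Return {address: {host: status_or_None}} for each address on which
    the hosts' views differ. None means the host does not list the address."""
    all_addresses = set()
    for ring in views.values():
        all_addresses.update(ring)

    disagreements = {}
    for address in sorted(all_addresses):
        seen = {host: ring.get(address) for host, ring in views.items()}
        if len(set(seen.values())) > 1:
            disagreements[address] = seen
    return disagreements


# Address -> status, transcribed from the two nodeprobe listings in the thread.
views = {
    "gen-app02.dev.real.com": {
        "172.27.128.186": "Down",
        "172.27.128.23": "Down",
        "172.27.128.22": "Up",
        "172.27.128.185": "Up",
    },
    "vmguest85.prognet.com": {
        "172.27.128.22": "Up",
        "172.27.128.185": "Up",
    },
}

for address, seen in diff_ring_views(views).items():
    print(address, seen)
```

With the data above, the only addresses reported are the two that gen-app02 lists as Down and vmguest85 does not list at all, which matches the mismatch described in the original report.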