From: Jonathan Ellis
Date: Wed, 25 Aug 2010 22:57:23 -0500
Subject: Re: 0.7.0.bet1 errors during start up
To: user@cassandra.apache.org

Yes, please open a ticket for the assertion error. (Once JIRA is back up...)

On Wed, Aug 25, 2010 at 10:46 PM, Aaron Morton wrote:
> Yes, starting the others made things a bit happier.
> Any thoughts on the assertion error that causes the startup to fail? I've
> seen it a couple of times.
> Seems to be from this line in CommitLogHeader.java
> 157:             assert clHeader.cfDirtiedAt.size() <= clHeader.cfCount;
> Thanks
> Aaron
> On 26 Aug, 2010, at 03:25 PM, Jonathan Ellis wrote:
>
> the one node you restarted thinks it's the only node in the cluster.
> starting the others will fix that.
>
> On Wed, Aug 25, 2010 at 10:10 PM, Aaron Morton wrote:
>> 0.7.0-beta1, 4-node cluster. I'd managed to get it into some sort of awful
>> state (I think by accidentally creating too many clients; it was also
>> complaining about running out of file handles). Anyway, I killed it all and
>> restarted just one node, thought I would let it settle down then start the
>> others. On the first node I got this.
>> (Sorry I cannot be more specific, I was not paying too much attention until
>> it all went bang)
>> I managed to get a couple of errors, one of which shut down the server. Just
>> checking before putting them into Jira: should I split them up?
>>
>> ERROR [pool-1-thread-29] 2010-08-26 14:58:20,021 Cassandra.java (line 2651) Internal error processing get_slice
>> java.lang.IllegalStateException: replication factor (3) exceeds number of endpoints (1)
>>         at org.apache.cassandra.locator.RackUnawareStrategy.calculateNaturalEndpoints(RackUnawareStrategy.java:57)
>>         at org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalEndpoints(AbstractReplicationStrategy.java:88)
>>         at org.apache.cassandra.service.StorageService.getNaturalEndpoints(StorageService.java:1289)
>>         at org.apache.cassandra.service.StorageService.getNaturalEndpoints(StorageService.java:1277)
>>         at org.apache.cassandra.service.StorageService.findSuitableEndpoint(StorageService.java:1323)
>>         at org.apache.cassandra.service.StorageProxy.strongRead(StorageProxy.java:402)
>>         at org.apache.cassandra.service.StorageProxy.readProtocol(StorageProxy.java:302)
>>         at org.apache.cassandra.thrift.CassandraServer.readColumnFamily(CassandraServer.java:125)
>>         at org.apache.cassandra.thrift.CassandraServer.getSlice(CassandraServer.java:231)
>>         at org.apache.cassandra.thrift.CassandraServer.multigetSliceInternal(CassandraServer.java:309)
>>         at org.apache.cassandra.thrift.CassandraServer.get_slice(CassandraServer.java:270)
>>         at org.apache.cassandra.thrift.Cassandra$Processor$get_slice.process(Cassandra.java:2643)
>>         at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2499)
>>         at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:167)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>         at java.lang.Thread.run(Thread.java:619)
>>
>> So I started the other 3, and two suffered the error below, which caused the
>> process to shut down...
>>
>> ERROR [main] 2010-08-26 14:59:22,315 AbstractCassandraDaemon.java (line 107) Exception encountered during startup.
>> java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.AssertionError
>>         at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:549)
>>         at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:339)
>>         at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:174)
>>         at org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:120)
>>         at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:90)
>>         at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:224)
>> Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.AssertionError
>>         at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
>>         at java.util.concurrent.FutureTask.get(FutureTask.java:83)
>>         at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:545)
>>         ... 5 more
>> Caused by: java.lang.RuntimeException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.AssertionError
>>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>         at java.lang.Thread.run(Thread.java:619)
>> Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.AssertionError
>>         at org.apache.cassandra.db.commitlog.CommitLog.discardCompletedSegments(CommitLog.java:408)
>>         at org.apache.cassandra.db.ColumnFamilyStore$2.runMayThrow(ColumnFamilyStore.java:445)
>>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>>         ... 6 more
>> Caused by: java.util.concurrent.ExecutionException: java.lang.AssertionError
>>         at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
>>         at java.util.concurrent.FutureTask.get(FutureTask.java:83)
>>         at org.apache.cassandra.db.commitlog.CommitLog.discardCompletedSegments(CommitLog.java:400)
>>         ... 8 more
>> Caused by: java.lang.AssertionError
>>         at org.apache.cassandra.db.commitlog.CommitLogHeader$CommitLogHeaderSerializer.serialize(CommitLogHeader.java:157)
>>         at org.apache.cassandra.db.commitlog.CommitLogHeader.writeCommitLogHeader(CommitLogHeader.java:124)
>>         at org.apache.cassandra.db.commitlog.CommitLogSegment.writeHeader(CommitLogSegment.java:70)
>>         at org.apache.cassandra.db.commitlog.CommitLog.discardCompletedSegmentsInternal(CommitLog.java:450)
>>         at org.apache.cassandra.db.commitlog.CommitLog.access$300(CommitLog.java:75)
>>         at org.apache.cassandra.db.commitlog.CommitLog$6.call(CommitLog.java:394)
>>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:52)
>>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>>         ... 1 more
>>
>> Aaron

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com