Return-Path:
X-Original-To: apmail-cassandra-commits-archive@www.apache.org
Delivered-To: apmail-cassandra-commits-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 38E0C18FF0 for ; Sun, 14 Feb 2016 22:10:19 +0000 (UTC)
Received: (qmail 47742 invoked by uid 500); 14 Feb 2016 22:10:19 -0000
Delivered-To: apmail-cassandra-commits-archive@cassandra.apache.org
Received: (qmail 47706 invoked by uid 500); 14 Feb 2016 22:10:19 -0000
Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@cassandra.apache.org
Delivered-To: mailing list commits@cassandra.apache.org
Received: (qmail 47462 invoked by uid 99); 14 Feb 2016 22:10:19 -0000
Received: from arcas.apache.org (HELO arcas) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Sun, 14 Feb 2016 22:10:19 +0000
Received: from arcas.apache.org (localhost [127.0.0.1]) by arcas (Postfix) with ESMTP id CE1622C14F4 for ; Sun, 14 Feb 2016 22:10:18 +0000 (UTC)
Date: Sun, 14 Feb 2016 22:10:18 +0000 (UTC)
From: "Samu Kallio (JIRA)"
To: commits@cassandra.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Comment Edited] (CASSANDRA-11158) AssertionError: null in Slice$Bound.create
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

    [ https://issues.apache.org/jira/browse/CASSANDRA-11158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15146748#comment-15146748 ]

Samu Kallio edited comment on CASSANDRA-11158 at 2/14/16 10:09 PM:
-------------------------------------------------------------------

I ended up having to delete the corrupted SSTable files on all nodes as nothing could touch them without crashing. Everything seemed OK for a while, then one of the nodes wrote a corrupt SSTable again.
Repairing the cluster is not possible right now because the merkle tree calculation crashes. Also, 2 of the 3 nodes seem to be running hinted handoff every 10 seconds constantly, even though the whole cluster is up and has seen no network disruptions since the one that triggered this issue:

{noformat}
INFO 22:05:51 Deleted hint file 97341307-d380-4808-89be-e04393bc0a5c-1455487534177-1.hints
INFO 22:05:51 Finished hinted handoff of file 97341307-d380-4808-89be-e04393bc0a5c-1455487534177-1.hints to endpoint 97341307-d380-4808-89be-e04393bc0a5c
INFO 22:06:01 Deleted hint file 97341307-d380-4808-89be-e04393bc0a5c-1455487544177-1.hints
INFO 22:06:01 Finished hinted handoff of file 97341307-d380-4808-89be-e04393bc0a5c-1455487544177-1.hints to endpoint 97341307-d380-4808-89be-e04393bc0a5c
INFO 22:06:11 Deleted hint file 97341307-d380-4808-89be-e04393bc0a5c-1455487554177-1.hints
INFO 22:06:11 Finished hinted handoff of file 97341307-d380-4808-89be-e04393bc0a5c-1455487554177-1.hints to endpoint 97341307-d380-4808-89be-e04393bc0a5c
{noformat}

was (Author: samukallio):
I ended up having to delete the corrupted SSTable files on all nodes as nothing could touch them without crashing. Everything seemed OK after a while, then one of the nodes wrote a corrupt SSTable again.

Repairing the cluster is not possible right now because the merkle tree calculation crashes. Also, 2 of the 3 nodes seem to be running hinted handoff every 10 seconds constantly, even though the whole cluster is up and has seen no network disruptions since the one that triggered this issue:

{noformat}
INFO 22:05:51 Deleted hint file 97341307-d380-4808-89be-e04393bc0a5c-1455487534177-1.hints
INFO 22:05:51 Finished hinted handoff of file 97341307-d380-4808-89be-e04393bc0a5c-1455487534177-1.hints to endpoint 97341307-d380-4808-89be-e04393bc0a5c
INFO 22:06:01 Deleted hint file 97341307-d380-4808-89be-e04393bc0a5c-1455487544177-1.hints
INFO 22:06:01 Finished hinted handoff of file 97341307-d380-4808-89be-e04393bc0a5c-1455487544177-1.hints to endpoint 97341307-d380-4808-89be-e04393bc0a5c
INFO 22:06:11 Deleted hint file 97341307-d380-4808-89be-e04393bc0a5c-1455487554177-1.hints
INFO 22:06:11 Finished hinted handoff of file 97341307-d380-4808-89be-e04393bc0a5c-1455487554177-1.hints to endpoint 97341307-d380-4808-89be-e04393bc0a5c
{noformat}

> AssertionError: null in Slice$Bound.create
> ------------------------------------------
>
>                 Key: CASSANDRA-11158
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11158
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Compaction, Local Write-Read Paths
>            Reporter: Samu Kallio
>
> We've been running Cassandra 3.0.2 for around a week now. Yesterday, we had a network event that briefly isolated one node from others in a 3 node cluster.
> Since then, we've been seeing a constant stream of "Finished hinted handoff" messages, as well as:
> {noformat}
> WARN 16:34:39 Uncaught exception on thread Thread[SharedPool-Worker-1,5,main]: {}
> java.lang.AssertionError: null
>     at org.apache.cassandra.db.Slice$Bound.create(Slice.java:365) ~[apache-cassandra-3.0.2.jar:3.0.2]
>     at org.apache.cassandra.db.Slice$Bound$Serializer.deserializeValues(Slice.java:553) ~[apache-cassandra-3.0.2.jar:3.0.2]
>     at org.apache.cassandra.db.ClusteringPrefix$Serializer.deserialize(ClusteringPrefix.java:274) ~[apache-cassandra-3.0.2.jar:3.0.2]
>     at org.apache.cassandra.db.Serializers$2.deserialize(Serializers.java:115) ~[apache-cassandra-3.0.2.jar:3.0.2]
>     at org.apache.cassandra.db.Serializers$2.deserialize(Serializers.java:107) ~[apache-cassandra-3.0.2.jar:3.0.2]
>     at org.apache.cassandra.io.sstable.IndexHelper$IndexInfo$Serializer.deserialize(IndexHelper.java:149) ~[apache-cassandra-3.0.2.jar:3.0.2]
>     at org.apache.cassandra.db.RowIndexEntry$Serializer.deserialize(RowIndexEntry.java:218) ~[apache-cassandra-3.0.2.jar:3.0.2]
>     at org.apache.cassandra.io.sstable.format.big.BigTableReader.getPosition(BigTableReader.java:216) ~[apache-cassandra-3.0.2.jar:3.0.2]
>     at org.apache.cassandra.io.sstable.format.SSTableReader.getPosition(SSTableReader.java:1568) ~[apache-cassandra-3.0.2.jar:3.0.2]
>     at org.apache.cassandra.db.columniterator.SSTableIterator.<init>(SSTableIterator.java:36) ~[apache-cassandra-3.0.2.jar:3.0.2]
>     at org.apache.cassandra.io.sstable.format.big.BigTableReader.iterator(BigTableReader.java:62) ~[apache-cassandra-3.0.2.jar:3.0.2]
>     at org.apache.cassandra.db.SinglePartitionReadCommand.queryMemtableAndSSTablesInTimestampOrder(SinglePartitionReadCommand.java:715) ~[apache-cassandra-3.0.2.jar:3.0.2]
>     at org.apache.cassandra.db.SinglePartitionReadCommand.queryMemtableAndDiskInternal(SinglePartitionReadCommand.java:482) ~[apache-cassandra-3.0.2.jar:3.0.2]
>     at org.apache.cassandra.db.SinglePartitionReadCommand.queryMemtableAndDisk(SinglePartitionReadCommand.java:459) ~[apache-cassandra-3.0.2.jar:3.0.2]
>     at org.apache.cassandra.db.SinglePartitionReadCommand.queryStorage(SinglePartitionReadCommand.java:325) ~[apache-cassandra-3.0.2.jar:3.0.2]
>     at org.apache.cassandra.db.ReadCommand.executeLocally(ReadCommand.java:350) ~[apache-cassandra-3.0.2.jar:3.0.2]
>     at org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:45) ~[apache-cassandra-3.0.2.jar:3.0.2]
>     at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67) ~[apache-cassandra-3.0.2.jar:3.0.2]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_72]
>     at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) ~[apache-cassandra-3.0.2.jar:3.0.2]
>     at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-3.0.2.jar:3.0.2]
>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_72]
> {noformat}
> and also
> {noformat}
> ERROR 06:10:11 Exception in thread Thread[CompactionExecutor:1,1,main]
> java.lang.AssertionError: null
>     at org.apache.cassandra.db.Slice$Bound.create(Slice.java:365) ~[apache-cassandra-3.0.3.jar:3.0.3]
>     at org.apache.cassandra.db.Slice$Bound$Serializer.deserializeValues(Slice.java:553) ~[apache-cassandra-3.0.3.jar:3.0.3]
>     at org.apache.cassandra.db.ClusteringPrefix$Serializer.deserialize(ClusteringPrefix.java:274) ~[apache-cassandra-3.0.3.jar:3.0.3]
>     at org.apache.cassandra.db.Serializers$2.deserialize(Serializers.java:115) ~[apache-cassandra-3.0.3.jar:3.0.3]
>     at org.apache.cassandra.db.Serializers$2.deserialize(Serializers.java:107) ~[apache-cassandra-3.0.3.jar:3.0.3]
>     at org.apache.cassandra.io.sstable.IndexHelper$IndexInfo$Serializer.deserialize(IndexHelper.java:149) ~[apache-cassandra-3.0.3.jar:3.0.3]
>     at org.apache.cassandra.db.RowIndexEntry$Serializer.deserialize(RowIndexEntry.java:218) ~[apache-cassandra-3.0.3.jar:3.0.3]
>     at org.apache.cassandra.io.sstable.format.big.BigTableScanner$KeyScanningIterator.computeNext(BigTableScanner.java:305) ~[apache-cassandra-3.0.3.jar:3.0.3]
>     at org.apache.cassandra.io.sstable.format.big.BigTableScanner$KeyScanningIterator.computeNext(BigTableScanner.java:260) ~[apache-cassandra-3.0.3.jar:3.0.3]
>     at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.0.3.jar:3.0.3]
>     at org.apache.cassandra.io.sstable.format.big.BigTableScanner.hasNext(BigTableScanner.java:240) ~[apache-cassandra-3.0.3.jar:3.0.3]
>     at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:369) ~[apache-cassandra-3.0.3.jar:3.0.3]
>     at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:189) ~[apache-cassandra-3.0.3.jar:3.0.3]
>     at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:158) ~[apache-cassandra-3.0.3.jar:3.0.3]
>     at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.0.3.jar:3.0.3]
>     at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$2.hasNext(UnfilteredPartitionIterators.java:150) ~[apache-cassandra-3.0.3.jar:3.0.3]
>     at org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:72) ~[apache-cassandra-3.0.3.jar:3.0.3]
>     at org.apache.cassandra.db.compaction.CompactionIterator.hasNext(CompactionIterator.java:226) ~[apache-cassandra-3.0.3.jar:3.0.3]
>     at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:177) ~[apache-cassandra-3.0.3.jar:3.0.3]
>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-3.0.3.jar:3.0.3]
>     at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:78) ~[apache-cassandra-3.0.3.jar:3.0.3]
>     at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60) ~[apache-cassandra-3.0.3.jar:3.0.3]
>     at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:263) ~[apache-cassandra-3.0.3.jar:3.0.3]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_72]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_72]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_72]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_72]
>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_72]
> {noformat}
> on all 3 nodes. I'm now upgrading the nodes to 3.0.3, but the issue seems to persist.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)