cassandra-commits mailing list archives

From "Matt Jurik (JIRA)" <>
Subject [jira] [Created] (CASSANDRA-6366) Corrupt SSTables
Date Sun, 17 Nov 2013 08:33:23 GMT
Matt Jurik created CASSANDRA-6366:

             Summary: Corrupt SSTables
                 Key: CASSANDRA-6366
             Project: Cassandra
          Issue Type: Bug
         Environment: 1.2.10
            Reporter: Matt Jurik

We ran into some corrupt sstables on one of our 8-node clusters running 1.2.10 (since upgraded
to 1.2.11). Initially, we saw one corrupt sstable on a single node. After running "nodetool
scrub" and then "nodetool repair -pr" across the cluster, we were left with 2 nodes reporting
3 corrupt sstables.

All nodes appear healthy; fsck and our raid controllers report no issues. The sstables were
written out during normal operation; there were no machine restarts or failures anywhere near
the sstable file timestamps.

Curiously, with the patch below I am able to read all 3 of our corrupt sstables, though I have
no idea why this works. Additionally, it seems that I'm able to read all OnDiskAtoms as specified
in the row header, so the data seems intact.

diff --git a/src/java/org/apache/cassandra/io/sstable/ b/src/java/org/apache/cassandra/io/sstable/
index 381fdb9..8fce5f7 100644
--- a/src/java/org/apache/cassandra/io/sstable/
+++ b/src/java/org/apache/cassandra/io/sstable/
@@ -180,6 +180,11 @@ public class SSTableIdentityIterator implements Comparable<SSTableIdentityIterat
     public boolean hasNext()
+         /*
+         * For each row where corruption is reported, it is the case that we read more data from the preceding row
+         * than specified by dataSize. That is, this iterator will terminate with:
+         *     inputWithTracker.getBytesRead() > dataSize
+         */
         return inputWithTracker.getBytesRead() < dataSize;
diff --git a/src/java/org/apache/cassandra/io/sstable/ b/src/java/org/apache/cassandra/io/sstable/
index 1df5842..718324c 100644
--- a/src/java/org/apache/cassandra/io/sstable/
+++ b/src/java/org/apache/cassandra/io/sstable/
@@ -167,8 +167,9 @@ public class SSTableScanner implements ICompactionScanner
-                if (row != null)
-          ;
+                // Magically read corrupt sstables...
+                // if (row != null)
+                //;
                 assert !dfile.isEOF();
                 // Read data header
diff --git a/src/java/org/apache/cassandra/tools/ b/src/java/org/apache/cassandra/tools/
index 05fe9f6..ed61010 100644
--- a/src/java/org/apache/cassandra/tools/
+++ b/src/java/org/apache/cassandra/tools/
@@ -432,7 +432,7 @@ public class SSTableExport
     public static void export(Descriptor desc, String[] excludes) throws IOException
-        export(desc, System.out, excludes);
+        export(desc, new PrintStream("json"), excludes);
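
The comment added to hasNext() describes a reader that stops only once it has consumed at least dataSize bytes, so it can finish past the declared end of the row. Here is a minimal sketch of that effect. It is not Cassandra's code: the class name, the helper names and the atom layout (a 2-byte length followed by a payload) are all invented for illustration.

```java
import java.nio.ByteBuffer;

public class RowOverrunSketch {
    // Hypothetical atom layout, for illustration only: 2-byte length, then payload.
    // Reads whole atoms while fewer than dataSize bytes have been consumed and
    // returns the number of bytes actually consumed.
    static long consumeRow(ByteBuffer in, long dataSize) {
        int start = in.position();
        while (in.position() - start < dataSize) { // same shape as hasNext() in the diff
            int len = in.getShort() & 0xFFFF;
            in.position(in.position() + len);      // skip the payload
        }
        return in.position() - start;
    }

    // Two atoms of 6 bytes each (2-byte length + 4-byte payload), 12 bytes total.
    static ByteBuffer twoAtoms() {
        ByteBuffer buf = ByteBuffer.allocate(12);
        for (int i = 0; i < 2; i++) {
            buf.putShort((short) 4).put(new byte[4]);
        }
        buf.flip();
        return buf;
    }

    // How many bytes past the declared dataSize the reader ends up.
    static long overrun(long declaredDataSize) {
        return consumeRow(twoAtoms(), declaredDataSize) - declaredDataSize;
    }

    public static void main(String[] args) {
        System.out.println(overrun(12)); // 0: header matches the atoms
        System.out.println(overrun(10)); // 2: header too small, reader overshoots
    }
}
```

In the second call the reader ends 2 bytes past the declared row end, which corresponds to the condition in the comment: getBytesRead() > dataSize. Whatever is parsed next starts at the wrong offset.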

Without this patch, I get a stack trace such as:

{code}
dataSize of 72339146324312065 starting at 80476328 would be larger than file /Users/exabytes18/development/yay/corrupt-sstables/corrupt-files3/my_keyspace-my_table-ic-40693-Data.db length 109073657
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(
    at java.lang.reflect.Method.invoke(
    at com.intellij.rt.execution.application.AppMain.main(
Caused by: dataSize of 72339146324312065 starting at 80476328 would be larger than file /Users/exabytes18/development/yay/corrupt-sstables/corrupt-files3/my_keyspace-my_table-ic-40693-Data.db length 109073657
    ... 14 more
{code}
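
The numbers in the trace can be checked by hand. The sketch below reproduces the kind of bounds check the message implies and prints the reported dataSize in hex. The class and method names are hypothetical and this is not the Cassandra implementation; only the three constants come from the trace.

```java
public class DataSizeCheckSketch {
    // True when a row claiming dataSize bytes starting at position cannot fit in the file.
    // Written as a subtraction so a huge dataSize cannot overflow position + dataSize.
    static boolean exceedsFile(long dataSize, long position, long fileLength) {
        return dataSize < 0 || dataSize > fileLength - position;
    }

    // Renders a long as 16 hex digits so the individual bytes are visible.
    static String toHex(long value) {
        return String.format("%016x", value);
    }

    public static void main(String[] args) {
        long dataSize = 72339146324312065L; // from the trace
        long position = 80476328L;          // from the trace
        long fileLength = 109073657L;       // from the trace
        System.out.println(exceedsFile(dataSize, position, fileLength)); // true
        System.out.println(toHex(dataSize)); // 0101001200040001
    }
}
```

In hex the reported dataSize is 0x0101001200040001, which looks more like several small adjacent fields than one 64-bit length. That would be consistent with the length being read from a misaligned offset, but it is only a guess from the byte pattern.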

Any help on the matter is appreciated.

This message was sent by Atlassian JIRA
