cassandra-commits mailing list archives

From "Sharvanath Pathak (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10479) Handling partially written sstables on node crashes
Date Fri, 09 Oct 2015 08:29:26 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14950066#comment-14950066 ]

Sharvanath Pathak commented on CASSANDRA-10479:
-----------------------------------------------

Deleting corrupt SSTables is, in general, a bad idea, and I agree that in your example I
wouldn't be happy if the SSTable were deleted automatically. However, one approach is to
maintain a persistent list of the SSTables that are currently being flushed, and to remove each
SSTable from that list before marking the corresponding commitlogs as non-dirty for that column
family. Any SSTable still on the list at bootup is then perfectly valid to delete, since all of
its data is still present in the commitlogs. Running Cassandra without this feature will require
manual intervention in most cases of node crashes, which is pretty bad for any system that
claims to be tolerant of node crashes.
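
A minimal sketch of the bookkeeping described above, in Java. This is not Cassandra's
actual code: the class name InFlightFlushJournal, its methods, and the one-name-per-line
journal file are all hypothetical, and a real implementation would also need to fsync the
journal file and its directory before relying on it.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical persistent list of SSTables whose flush has not completed. */
class InFlightFlushJournal {
    private final Path journal;

    InFlightFlushJournal(Path journal) {
        this.journal = journal;
    }

    /** Record an SSTable name before the flush starts writing it. */
    synchronized void begin(String sstable) throws IOException {
        Set<String> names = read();
        names.add(sstable);
        write(names);
    }

    /**
     * Remove an SSTable name once it is fully written. The caller is assumed
     * to do this BEFORE marking the matching commitlogs as non-dirty.
     */
    synchronized void complete(String sstable) throws IOException {
        Set<String> names = read();
        names.remove(sstable);
        write(names);
    }

    /**
     * On startup: anything still listed was mid-flush at crash time, so its
     * data is still in the commitlogs and the partial file can be deleted.
     * Returns the names of the files that were actually deleted.
     */
    synchronized List<String> recover(Path dataDir) throws IOException {
        List<String> deleted = new ArrayList<>();
        for (String name : read()) {
            if (Files.deleteIfExists(dataDir.resolve(name))) {
                deleted.add(name);
            }
        }
        write(new TreeSet<String>());
        return deleted;
    }

    private Set<String> read() throws IOException {
        if (!Files.exists(journal)) {
            return new TreeSet<>();
        }
        return new TreeSet<>(Files.readAllLines(journal, StandardCharsets.UTF_8));
    }

    private void write(Set<String> names) throws IOException {
        // Write to a temp file and rename it so a crash never leaves a
        // half-written journal. Replacing an existing target under
        // ATOMIC_MOVE is implementation specific; it works on POSIX systems.
        Path tmp = journal.resolveSibling(journal.getFileName() + ".tmp");
        Files.write(tmp, names, StandardCharsets.UTF_8);
        Files.move(tmp, journal, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

The ordering is the important part: because the name is removed from the list before the
commitlogs are marked non-dirty, any SSTable still listed at startup is guaranteed to be
replayable from the commitlogs.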

> Handling partially written sstables on node crashes
> ---------------------------------------------------
>
>                 Key: CASSANDRA-10479
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10479
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sharvanath Pathak
>
> Currently a power loss can require manual intervention to bring Cassandra back up.
> Partially written SSTables are treated as corrupt, and we see the following trace quite
> often on hard reboots:
> {noformat}
> INFO  [SSTableBatchOpen:1] 2015-09-29 19:24:39,170 SSTableReader.java:478 - Opening /var/lib/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-13368 (79 bytes)
> ERROR [SSTableBatchOpen:1] 2015-09-29 19:24:39,177 FileUtils.java:447 - Exiting forcefully due to file system exception on startup, disk failure policy "stop"
> org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException
>         at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:131) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.io.compress.CompressionMetadata.create(CompressionMetadata.java:85) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.io.util.CompressedSegmentedFile$Builder.metadata(CompressedSegmentedFile.java:79) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:72) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.io.util.SegmentedFile$Builder.complete(SegmentedFile.java:168) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:752) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:703) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:491) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:387) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.io.sstable.SSTableReader$4.run(SSTableReader.java:534) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_80]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_80]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_80]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_80]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
> Caused by: java.io.EOFException: null
>         at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340) ~[na:1.7.0_80]
>         at java.io.DataInputStream.readUTF(DataInputStream.java:589) ~[na:1.7.0_80]
>         at java.io.DataInputStream.readUTF(DataInputStream.java:564) ~[na:1.7.0_80]
>         at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:106) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         ... 14 common frames omitted
> {noformat}
> Deleting partially written SSTables might be a perfectly valid thing to do (given that
> the data is present in commitlogs).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
