cassandra-commits mailing list archives

From "Yuki Morishita (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-11750) Offline scrub should not abort when it hits corruption
Date Thu, 19 May 2016 13:24:12 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-11750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15291084#comment-15291084 ]

Yuki Morishita commented on CASSANDRA-11750:
--------------------------------------------

||branch||testall||dtest||
|[11750-2.1|https://github.com/yukim/cassandra/tree/11750-2.1]|[testall|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-11750-2.1-testall/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-11750-2.1-dtest/lastCompletedBuild/testReport/]|
|[11750-2.2|https://github.com/yukim/cassandra/tree/11750-2.2]|[testall|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-11750-2.2-testall/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-11750-2.2-dtest/lastCompletedBuild/testReport/]|

Backported CASSANDRA-11578 to 2.1/2.2. This enables tools to skip the disk failure policy check.
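
As a rough illustration of the idea only (this is not the backported patch), the sketch below shows an error handler that applies the "stop" policy when running as the daemon but lets an offline tool move on to the next file. Every name in it (`FailurePolicySketch`, `toolMode`, `onCorruption`, `Action`) is hypothetical and does not correspond to a Cassandra class or method, and the policy handling is deliberately simplified to the single "stop" case seen in the report.

```java
// Hypothetical illustration only; none of these names are Cassandra APIs.
public class FailurePolicySketch {

    /** What the caller should do after hitting a corrupt file. */
    public enum Action { EXIT, SKIP_FILE }

    // True when running inside an offline tool rather than the daemon (assumed flag).
    private final boolean toolMode;

    public FailurePolicySketch(boolean toolMode) {
        this.toolMode = toolMode;
    }

    /**
     * Decide how to react to corruption. In tool mode the policy is not
     * consulted at all; in daemon mode "stop" means exit (simplified).
     */
    public Action onCorruption(String diskFailurePolicy) {
        if (toolMode) {
            return Action.SKIP_FILE;
        }
        return "stop".equals(diskFailurePolicy) ? Action.EXIT : Action.SKIP_FILE;
    }

    public static void main(String[] args) {
        System.out.println("daemon: " + new FailurePolicySketch(false).onCorruption("stop"));
        System.out.println("tool:   " + new FailurePolicySketch(true).onCorruption("stop"));
    }
}
```

With a check of this shape, a scrub run would report the corrupt file and continue rather than exit on the first one.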

> Offline scrub should not abort when it hits corruption
> ------------------------------------------------------
>
>                 Key: CASSANDRA-11750
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11750
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Adam Hattrell
>            Assignee: Yuki Morishita
>            Priority: Minor
>              Labels: Tools
>             Fix For: 2.1.x, 2.2.x
>
>
> Hit a failure on startup due to corruption of some sstables in system keyspace.  Deleted the listed file and restarted - came down again with another file.
> Figured that I may as well run scrub to clean up all the files.  Got following error:
> {noformat}
> sstablescrub system compaction_history 
> ERROR 17:21:34 Exiting forcefully due to file system exception on startup, disk failure policy "stop"
> org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: /cassandra/data/system/compaction_history-b4dbb7b4dc493fb5b3bfce6e434832ca/system-compaction_history-ka-1936-CompressionInfo.db
> at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:131) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at org.apache.cassandra.io.compress.CompressionMetadata.create(CompressionMetadata.java:85) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at org.apache.cassandra.io.util.CompressedSegmentedFile$Builder.metadata(CompressedSegmentedFile.java:79) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:72) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at org.apache.cassandra.io.util.SegmentedFile$Builder.complete(SegmentedFile.java:169) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:741) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:692) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:480) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:376) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at org.apache.cassandra.io.sstable.SSTableReader$4.run(SSTableReader.java:523) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_79]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_79]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_79]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_79]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79]
> Caused by: java.io.EOFException: null
> at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340) ~[na:1.7.0_79]
> at java.io.DataInputStream.readUTF(DataInputStream.java:589) ~[na:1.7.0_79]
> at java.io.DataInputStream.readUTF(DataInputStream.java:564) ~[na:1.7.0_79]
> at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:106) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
> ... 14 common frames omitted
> {noformat}
> I guess it might be by design - but I'd argue that I should at least have the option to continue and let it do its thing.  I'd prefer that sstablescrub ignored the disk failure policy.
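
The `Caused by` section of the trace above ends in `DataInputStream.readUTF`, which suggests the CompressionInfo.db file ended before a length-prefixed string could be read. The following self-contained sketch reproduces only that JDK behaviour on an in-memory stream; the class and method names (`TruncatedReadSketch`, `hitsEof`) are made up for illustration, and it does not touch any Cassandra code or file format.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

// Hypothetical demo class; it only exercises java.io.DataInputStream.
public class TruncatedReadSketch {

    /** Returns true if readUTF() runs out of bytes before a full string is read. */
    public static boolean hitsEof(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            in.readUTF(); // reads a 2-byte length, then that many bytes
            return false;
        } catch (EOFException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        // Empty input: the 2-byte length prefix itself is missing.
        System.out.println("empty input hits EOF:      " + hitsEof(new byte[0]));
        // Length prefix says 2 bytes, and both bytes are present.
        System.out.println("well-formed input hits EOF: " + hitsEof(new byte[] {0, 2, 'o', 'k'}));
    }
}
```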



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)