incubator-cassandra-user mailing list archives

From Colin Kuo <colinkuo...@gmail.com>
Subject Re: Advice on how to handle corruption in system/hints
Date Mon, 09 Jun 2014 10:29:46 GMT
Hi Francois,

We're facing the same issue as you. Our approach is to:

1. scrub that corrupted data file
2. repair that column family
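
As a rough sketch of the two steps above, the corresponding nodetool invocations could be assembled as below. The helper name `recovery_commands` is hypothetical, and `system`/`hints` are simply the keyspace and column family from this thread. Repair presumably only helps for replicated column families, so treat step 2 as an assumption for a node-local table.

```python
from typing import List


def recovery_commands(keyspace: str, column_family: str) -> List[List[str]]:
    """Return the nodetool invocations for scrub followed by repair.

    The commands are only built here, not executed, so they can be
    reviewed before being run on the affected node.
    """
    return [
        # Step 1: rewrite the sstables, skipping rows that cannot be read.
        ["nodetool", "scrub", keyspace, column_family],
        # Step 2: re-sync the column family with its replicas.
        ["nodetool", "repair", keyspace, column_family],
    ]


if __name__ == "__main__":
    for cmd in recovery_commands("system", "hints"):
        print(" ".join(cmd))
```

Building the commands as argument lists keeps them easy to pass to `subprocess.run` later without shell quoting problems.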

Deleting the corrupted files right away is not recommended while the C*
instance is running.
This kind of corruption can be caused by a bad disk or a power outage.
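
To see how far off the reported size is, here is a small sketch using only the three numbers from the quoted exception. It computes how many bytes remain in the file after the row start, and shows the reported dataSize as eight big-endian bytes (the byte order Java's DataInput.readLong uses). The function names are hypothetical. The value decodes to printable ASCII, which would be consistent with the reader picking up payload bytes where it expected a row length, though that is only a guess from the numbers.

```python
DATA_SIZE = 8224262783474088549  # "dataSize of ..." in the exception
ROW_START = 502360510            # "starting at ..."
FILE_LENGTH = 504590769          # "... length ..."


def bytes_remaining(row_start: int, file_length: int) -> int:
    """Bytes left in the file after the position where the row starts."""
    return file_length - row_start


def as_big_endian_bytes(value: int) -> bytes:
    """Render a 64-bit value as the eight bytes it was read from."""
    return value.to_bytes(8, "big")


if __name__ == "__main__":
    left = bytes_remaining(ROW_START, FILE_LENGTH)
    print("bytes left in file:", left)            # 2230259
    print("size fits in file: ", DATA_SIZE <= left)
    print("size as raw bytes: ", as_big_endian_bytes(DATA_SIZE))
```

Only about 2 MB remain after the row start, so a size around 8.2e18 cannot be a real row length.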

Thanks,

Colin




On Mon, Jun 9, 2014 at 6:11 AM, Francois Richard <frichard@yahoo-inc.com>
wrote:

>  Hi everyone,
>
>  We are running some Cassandra clusters (usually 5 nodes with a
> replication factor of 3), and at least once per day we see corruption
> in a specific sstable in system/hints. (We are using
> Cassandra version 1.2.16 on RHEL 6.5.)
>
>  Here is an example of such an exception:
>
>   ERROR [CompactionExecutor:1694] 2014-06-08 21:37:33,267
> CassandraDaemon.java (line 191) Exception in thread
> Thread[CompactionExecutor:1694,1,main]
>
> org.apache.cassandra.io.sstable.CorruptSSTableException:
> java.io.IOException: dataSize of 8224262783474088549 starting at 502360510
> would be larger than file
> /home/y/var/cassandra/data/system/hints/system-hints-ic-281-Data.db length
> 504590769
>
>         at
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:167)
>
>         at
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:83)
>
>         at
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:69)
>
>         at
> org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:180)
>
>         at
> org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:155)
>
>         at
> org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:142)
>
>         at
> org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:38)
>
>         at
> org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:145)
>
>         at
> org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
>
>         at
> org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
>
>         at
> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>
>         at
> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>
>         at
> org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:145)
>
>         at
> org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
>
>         at
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>
>         at
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
>
>         at
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
>
>         at
> org.apache.cassandra.db.compaction.CompactionManager$7.runMayThrow(CompactionManager.java:442)
>
>         at
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>
>         at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>
>         at java.lang.Thread.run(Thread.java:745)
>
> Caused by: java.io.IOException: dataSize of 8224262783474088549 starting
> at 502360510 would be larger than file
> /home/y/var/cassandra/data/system/hints/system-hints-ic-281-Data.db length
> 504590769
>
>         at
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:123)
>
>         ... 23 more
>
>  INFO [HintedHandoff:35] 2014-06-08 21:37:33,267
> HintedHandOffManager.java (line 296) Started hinted handoff for host:
> 502a48cd-171b-4e83-a9ad-67f32437353a with IP: /10.210.239.190
>
> ERROR [HintedHandoff:33] 2014-06-08 21:37:33,267 CassandraDaemon.java
> (line 191) Exception in thread Thread[HintedHandoff:33,1,main]
>
> java.lang.RuntimeException: java.util.concurrent.ExecutionException:
> org.apache.cassandra.io.sstable.CorruptSSTableException:
> java.io.IOException: dataSize of 8224262783474088549 starting at 502360510
> would be larger than file
> /home/y/var/cassandra/data/system/hints/system-hints-ic-281-Data.db length
> 504590769
>
>         at
> org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:441)
>
>         at
> org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:282)
>
>         at
> org.apache.cassandra.db.HintedHandOffManager.access$300(HintedHandOffManager.java:90)
>
>         at
> org.apache.cassandra.db.HintedHandOffManager$4.run(HintedHandOffManager.java:508)
>
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>
>         at java.lang.Thread.run(Thread.java:745)
>
> Caused by: java.util.concurrent.ExecutionException:
> org.apache.cassandra.io.sstable.CorruptSSTableException:
> java.io.IOException: dataSize of 8224262783474088549 starting at 502360510
> would be larger than file
> /home/y/var/cassandra/data/system/hints/system-hints-ic-281-Data.db length
> 504590769
>
>         at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>
>         at java.util.concurrent.FutureTask.get(FutureTask.java:188)
>
>         at
> org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:437)
>
>         ... 6 more
>
> Caused by: org.apache.cassandra.io.sstable.CorruptSSTableException:
> java.io.IOException: dataSize of 8224262783474088549 starting at 502360510
> would be larger than file
> /home/y/var/cassandra/data/system/hints/system-hints-ic-281-Data.db length
> 504590769
>
>         at
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:167)
>
>         at
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:83)
>
>         at
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:69)
>
>         at
> org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:180)
>
>         at
> org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:155)
>
>         at
> org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:142)
>
>         at
> org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:38)
>
>         at
> org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:145)
>
>         at
> org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
>
>         at
> org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
>
>         at
> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>
>         at
> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>
>         at
> org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:145)
>
>         at
> org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
>
>         at
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>
>         at
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
>
>         at
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
>
>
>
>  Our current filesystem configuration for Cassandra: (nothing fancy …)
>
>
>  /dev/sda6            /home/y/var/cassandra/commitlog ext4
> defaults,commit=20,noatime,nobarrier,nodiratime   0 0
>
> /dev/sda7            /home/y/var/cassandra/data ext4
> defaults,commit=20,data=writeback,noatime,nobarrier,nodiratime   0 0
>
>
>
>  The workaround we have right now is the following:
>
>
>  1-  delete the “guilty” sstable, in this case:
> /home/y/var/cassandra/data/system/hints/system-hints-ic-281*
>
> 2- Issue a major compaction for system/hints -> nodetool compact system
> hints;
>
> 3- Repeat for all the sstables producing this issue.
>
>
>
>  My biggest worry here is around the following message:
>
>
>   org.apache.cassandra.io.sstable.CorruptSSTableException:
> java.io.IOException: dataSize of *8224262783474088549* starting at
> 502360510 would be larger than file
> /home/y/var/cassandra/data/system/hints/system-hints-ic-281-Data.db length
> *504590769*
>
>
>
>  Any clues on why this is happening ?
>
>
>
>  Thanks,
>
>
>  FR
>
>
>
>
>
>
>
>
