cassandra-commits mailing list archives

From "Dan Hendry (JIRA)" <j...@apache.org>
Subject [jira] Created: (CASSANDRA-2084) Corrupt sstables cause compaction to fail again, and again and again, ...
Date Mon, 31 Jan 2011 20:08:29 GMT
Corrupt sstables cause compaction to fail again, and again and again, ...
-------------------------------------------------------------------------

                 Key: CASSANDRA-2084
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2084
             Project: Cassandra
          Issue Type: Bug
          Components: Core
    Affects Versions: 0.7.0
         Environment: Ubuntu 10.10
Cassandra 0.7.0
4 Nodes
            Reporter: Dan Hendry


I have been having some serious data corruption issues in my cluster. I suspect a deeper,
more serious Cassandra bug, but I don't know what or where it is, and I have not found a way
to reproduce the issues.

This ticket is for a behaviour I have observed where Cassandra starts compacting a set of
sstables, fails, does not clean up the tmp files, then starts compacting the exact same set
of sstables again (see logs below). After a while, the node runs out of disk space and crashes.
At the very least, Cassandra should clean up temp files after a failed compaction. Better
yet, it should stop trying to compact the corrupt file and log which file the error occurred for.
The list of corrupt sstables does not even have to be persistent; an in-memory list that
gets wiped out on a restart would be enough.
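
To make the two requests above concrete, here is a minimal sketch of the suggested behaviour. It is not Cassandra code: `CompactionGuard`, `Compactor`, `eligible` and `tryCompact` are hypothetical names invented for illustration. It also assumes the failure cannot be traced to a single input file, so it marks every sstable in the failed set as suspect.

```java
import java.io.IOError;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch; not Cassandra's CompactionManager. */
final class CompactionGuard {

    /** Hypothetical stand-in for the real compaction work. */
    interface Compactor {
        void compact(List<Path> inputs, Path tmpOutput) throws IOException;
    }

    // In-memory only: a restart clears the list, as the ticket suggests.
    private final Set<Path> suspect = ConcurrentHashMap.newKeySet();

    /** Returns the candidates that have not been part of a failed compaction. */
    List<Path> eligible(List<Path> candidates) {
        List<Path> result = new ArrayList<>();
        for (Path p : candidates) {
            if (!suspect.contains(p)) {
                result.add(p);
            }
        }
        return result;
    }

    /** Runs one compaction; on failure, logs the inputs and deletes the tmp file. */
    boolean tryCompact(List<Path> inputs, Path tmpOutput, Compactor compactor) {
        try {
            compactor.compact(inputs, tmpOutput);
            return true;
        } catch (IOException | IOError e) {
            // Log which files were involved so the operator can inspect them.
            System.err.println("Compaction of " + inputs + " failed: " + e);
            // Assumption: the culprit is unknown, so the whole set is excluded.
            suspect.addAll(inputs);
            try {
                // Remove the partial output so retries cannot fill the disk.
                Files.deleteIfExists(tmpOutput);
            } catch (IOException cleanupFailure) {
                System.err.println("Could not delete " + tmpOutput + ": " + cleanupFailure);
            }
            return false;
        }
    }
}
```

A scheduler built on this sketch would pass its candidate sstables through `eligible` before each attempt, so a set that has already failed is not picked up again until the node restarts.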

Here is a sample log. The same 4 sstables are compacted, the compaction fails, and then they
are compacted again.

 INFO [CompactionExecutor:1] 2011-01-31 13:08:26,434 CompactionManager.java (line 272) Compacting
[org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/kikmetrics/DeviceEventsByDevice-e-562-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/kikmetrics/DeviceEventsByDevice-e-692-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/kikmetrics/DeviceEventsByDevice-e-773-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/kikmetrics/DeviceEventsByDevice-e-940-Data.db')]
 INFO [HintedHandoff:1] 2011-01-31 13:08:28,878 HintedHandOffManager.java (line 226) Could
not complete hinted handoff to /192.168.4.16
 INFO [HintedHandoff:1] 2011-01-31 13:08:28,879 ColumnFamilyStore.java (line 648) switching
in a fresh Memtable for HintsColumnFamily at CommitLogContext(file='/var/lib/cassandra/commitlog/CommitLog-1296500864696.log',
position=104140211)
 INFO [HintedHandoff:1] 2011-01-31 13:08:28,879 ColumnFamilyStore.java (line 952) Enqueuing
flush of Memtable-HintsColumnFamily@1652350488(1155546 bytes, 20839 operations)
 INFO [FlushWriter:1] 2011-01-31 13:08:28,879 Memtable.java (line 155) Writing Memtable-HintsColumnFamily@1652350488(1155546
bytes, 20839 operations)
 INFO [FlushWriter:1] 2011-01-31 13:08:29,199 Memtable.java (line 162) Completed flushing
/var/lib/cassandra/data/system/HintsColumnFamily-e-9-Data.db (1075487 bytes)
 INFO [GossipStage:1] 2011-01-31 13:08:45,508 Gossiper.java (line 569) InetAddress /192.168.4.16
is now UP
 INFO [COMMIT-LOG-WRITER] 2011-01-31 13:08:59,736 CommitLogSegment.java (line 50) Creating
new commitlog segment /var/lib/cassandra/commitlog/CommitLog-1296500939735.log
 INFO [MutationStage:8] 2011-01-31 13:09:15,868 ColumnFamilyStore.java (line 648) switching
in a fresh Memtable for UserSearch at CommitLogContext(file='/var/lib/cassandra/commitlog/CommitLog-1296500939735.log',
position=56028937)
 INFO [MutationStage:8] 2011-01-31 13:09:15,868 ColumnFamilyStore.java (line 952) Enqueuing
flush of Memtable-UserSearch@1186863256(174163962 bytes, 2097155 operations)
 INFO [FlushWriter:1] 2011-01-31 13:09:15,868 Memtable.java (line 155) Writing Memtable-UserSearch@1186863256(174163962
bytes, 2097155 operations)
ERROR [CompactionExecutor:1] 2011-01-31 13:09:22,462 AbstractCassandraDaemon.java (line 91)
Fatal exception in thread Thread[CompactionExecutor:1,1,main]
java.io.IOError: java.io.EOFException: attempted to skip 776104308 bytes but only skipped
8469212
        at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:78)
        at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:178)
        at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:143)
        at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:135)
        at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:38)
        at org.apache.commons.collections.iterators.CollatingIterator.set(CollatingIterator.java:284)
        at org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:326)
        at org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:230)
        at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:68)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
        at org.apache.commons.collections.iterators.FilterIterator.setNextObject(FilterIterator.java:183)
        at org.apache.commons.collections.iterators.FilterIterator.hasNext(FilterIterator.java:94)
        at org.apache.cassandra.db.CompactionManager.doCompaction(CompactionManager.java:323)
        at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:122)
        at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:92)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.EOFException: attempted to skip 776104308 bytes but only skipped 8469212
        at org.apache.cassandra.io.sstable.IndexHelper.skipBloomFilter(IndexHelper.java:52)
        at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:69)
        ... 20 more
 INFO [CompactionExecutor:1] 2011-01-31 13:09:22,463 CompactionManager.java (line 272) Compacting
[org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/kikmetrics/DeviceEventsByDevice-e-562-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/kikmetrics/DeviceEventsByDevice-e-692-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/kikmetrics/DeviceEventsByDevice-e-773-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/kikmetrics/DeviceEventsByDevice-e-940-Data.db')]

 INFO [FlushWriter:1] 2011-01-31 13:09:29,010 Memtable.java (line 162) Completed flushing
/var/lib/cassandra/data/kikmetrics/UserSearch-e-1264-Data.db (184687455 bytes)
 INFO [COMMIT-LOG-WRITER] 2011-01-31 13:09:38,221 CommitLogSegment.java (line 50) Creating
new commitlog segment /var/lib/cassandra/commitlog/CommitLog-1296500978221.log
 INFO [COMMIT-LOG-WRITER] 2011-01-31 13:10:15,781 CommitLogSegment.java (line 50) Creating
new commitlog segment /var/lib/cassandra/commitlog/CommitLog-1296501015781.log
ERROR [CompactionExecutor:1] 2011-01-31 13:10:29,139 AbstractCassandraDaemon.java (line 91)
Fatal exception in thread Thread[CompactionExecutor:1,1,main]
java.io.IOError: java.io.EOFException: attempted to skip 776104308 bytes but only skipped
8469212
        at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:78)
        at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:178)
        at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:143)
        at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:135)
        at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:38)
        at org.apache.commons.collections.iterators.CollatingIterator.set(CollatingIterator.java:284)
        at org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:326)
        at org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:230)
        at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:68)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
        at org.apache.commons.collections.iterators.FilterIterator.setNextObject(FilterIterator.java:183)
        at org.apache.commons.collections.iterators.FilterIterator.hasNext(FilterIterator.java:94)
        at org.apache.cassandra.db.CompactionManager.doCompaction(CompactionManager.java:323)
        at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:122)
        at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:92)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.EOFException: attempted to skip 776104308 bytes but only skipped 8469212
        at org.apache.cassandra.io.sstable.IndexHelper.skipBloomFilter(IndexHelper.java:52)
        at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:69)
        ... 20 more
 INFO [CompactionExecutor:1] 2011-01-31 13:10:29,148 CompactionManager.java (line 272) Compacting
[org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/kikmetrics/DeviceEventsByDevice-e-562-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/kikmetrics/DeviceEventsByDevice-e-692-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/kikmetrics/DeviceEventsByDevice-e-773-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/kikmetrics/DeviceEventsByDevice-e-940-Data.db')]


-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
