incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: RuntimeException during leveled compaction
Date Sat, 16 Feb 2013 17:33:26 GMT
That sounds like something is wrong with the way the rows are merged during compaction, then.

Can you run the compaction with DEBUG logging and raise a ticket? You may want to do this
with the node not in the ring. Five minutes after the node starts it will run pending
compactions, so if compactions are not currently running they should start again.
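
A minimal sketch of both steps, assuming the stock log4j setup that ships with 1.1
(conf/log4j-server.properties) and using gossip/thrift shutdown as one way to take the
node out of the ring without a full restart:

    # conf/log4j-server.properties: enable compaction debug logging
    log4j.logger.org.apache.cassandra.db.compaction=DEBUG

    # take the node out of the ring and stop serving clients
    nodetool disablegossip
    nodetool disablethrift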

Cheers
 
-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 13/02/2013, at 8:11 PM, Andre Sprenger <andre.sprenger@getanet.de> wrote:

> 
> Aaron,
> 
> thanks for your help. 
> 
> I ran 'nodetool scrub' and it finished after a couple of hours. But there are no messages
> about out of order rows in the logs, and compaction on the column family still raises
> the same exception.
> 
> With the row key I was able to identify some of the errant SSTables and removed them
> during a node restart. On some nodes compaction is working for the moment, but there
> are likely more corrupt data files, and then I would be in the same situation as before.
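> 
> A minimal sketch of that kind of removal, using a hypothetical generation number in
> place of one identified via the row key, and assuming the node is stopped first and
> the 'corrupt' snapshot is kept as a fallback:
> 
>     # with Cassandra stopped on this node
>     cd /var/cassandra/data/AdServer/EventHistory
>     mkdir -p /var/cassandra/quarantine
>     # move every component (Data, Index, Filter, ...) of the suspect generation;
>     # 12345 is a placeholder, not a real generation from the logs
>     mv Adserver-EventHistory-he-12345-* /var/cassandra/quarantine/
>     # after restart, repair pulls the removed rows back from the replicas
>     nodetool repair AdServer EventHistory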
> 
> So I still need some help to resolve this issue!
> 
> Cheers
> Andre
> 
> 
> 2013/2/12 aaron morton <aaron@thelastpickle.com>
> snapshot all nodes so you have a backup: nodetool snapshot -t corrupt
> 
> run nodetool scrub on the errant CF. 
> 
> Look for messages such as:
> 
> "Out of order row detected…"
> "%d out of order rows found while scrubbing %s; Those have been written (in order) to
a new sstable (%s)"
> 
> In the logs. 
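> 
> A minimal sketch of those steps, assuming the keyspace and column family names from
> the stack trace below (AdServer, EventHistory) and the packaged default log location:
> 
>     # back up every node first
>     nodetool snapshot -t corrupt
>     # rewrite the sstables of the errant column family
>     nodetool scrub AdServer EventHistory
>     # then check for the messages above
>     grep -i "out of order" /var/log/cassandra/system.log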
> 
> Cheers
>   
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
> 
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 12/02/2013, at 6:13 AM, Andre Sprenger <andre.sprenger@getanet.de> wrote:
> 
>> Hi,
>> 
>> I'm running a 6-node Cassandra 1.1.5 cluster on EC2. We switched to leveled compaction
>> a couple of weeks ago, and this has been successful. A few days ago 3 of the nodes
>> started to log the following exception during compaction of a particular column family:
>> 
>> ERROR [CompactionExecutor:726] 2013-02-11 13:02:26,582 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[CompactionExecutor:726,1,main]
>> java.lang.RuntimeException: Last written key DecoratedKey(84590743047470232854915142878708713938, 31333535333333383530323237303130313030303232313537303030303132393832) >= current key DecoratedKey(28357704665244162161305918843747894551, 31333430313336313830333831303130313030303230313632303030303036363338) writing into /var/cassandra/data/AdServer/EventHistory/Adserver-EventHistory-tmp-he-68638-Data.db
>>         at org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:134)
>>         at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:153)
>>         at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:159)
>>         at org.apache.cassandra.db.compaction.LeveledCompactionTask.execute(LeveledCompactionTask.java:50)
>>         at org.apache.cassandra.db.compaction.CompactionManager$1.runMayThrow(CompactionManager.java:154)
>>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>         at java.lang.Thread.run(Thread.java:662)
>> 
>> Compaction no longer happens for this column family, and read performance is getting
>> worse because of the growing number of data files accessed during reads. It looks like
>> one or more of the data files are corrupt and contain keys stored out of order.
>> 
>> Any help to resolve this situation would be greatly appreciated.
>> 
>> Thanks
>> Andre
>> 
> 
> 

