cassandra-commits mailing list archives

From "Marcus Eriksson (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (CASSANDRA-8190) Compactions stop completely because of RuntimeException in CompactionExecutor
Date Wed, 29 Oct 2014 15:41:34 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-8190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eriksson resolved CASSANDRA-8190.
----------------------------------------
    Resolution: Won't Fix

The compactions probably stop because we don't call 'submitBackground()' when the CompactionTask
throws an exception, and since there are no writes to the node, no new compaction tasks are triggered.

This could be 'fixed' by calling submitBackground() in a finally block after the compaction
task is executed, but I think that might be a bad idea: we could end up in an infinite
loop if the compaction tasks keep throwing exceptions. And if we do throw,
the node might need some manual intervention anyway.
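
To illustrate the trade-off, here is a minimal, self-contained sketch of resubmitting from a finally block, with a cap on consecutive failures so that a task that always throws cannot loop forever. This is not Cassandra's code: ResubmitSketch, Task, BackgroundRunner, maxConsecutiveFailures and the single-threaded queue are all hypothetical stand-ins, and only the name submitBackground() is borrowed from the discussion above.

```java
import java.util.ArrayDeque;
import java.util.Queue;

final class ResubmitSketch {
    /** Hypothetical stand-in for a compaction task; returns true if more work remains. */
    interface Task {
        boolean runOnce();
    }

    /** Hypothetical single-threaded runner; a real executor would use a thread pool. */
    static final class BackgroundRunner {
        private final Queue<Runnable> queue = new ArrayDeque<>();
        private final int maxConsecutiveFailures; // assumed safety cap, not a real setting
        private int consecutiveFailures;
        private int attempts;
        private boolean halted;

        BackgroundRunner(int maxConsecutiveFailures) {
            this.maxConsecutiveFailures = maxConsecutiveFailures;
        }

        void submitBackground(Task task) {
            if (halted) {
                return; // too many failures in a row: wait for manual intervention
            }
            queue.add(() -> {
                boolean moreWork = true; // if the task throws, assume work is still pending
                boolean failed = true;
                try {
                    attempts++;
                    moreWork = task.runOnce();
                    failed = false;
                } catch (RuntimeException e) {
                    // A real system would log this; the sketch just counts it below.
                } finally {
                    consecutiveFailures = failed ? consecutiveFailures + 1 : 0;
                    if (consecutiveFailures >= maxConsecutiveFailures) {
                        halted = true; // break the potential infinite loop
                    } else if (moreWork) {
                        submitBackground(task); // resubmit even after an exception
                    }
                }
            });
        }

        /** Runs queued work until nothing is left. */
        void drain() {
            Runnable next;
            while ((next = queue.poll()) != null) {
                next.run();
            }
        }

        int attempts() { return attempts; }
        boolean isHalted() { return halted; }
    }

    public static void main(String[] args) {
        BackgroundRunner runner = new BackgroundRunner(3);
        runner.submitBackground(() -> { throw new RuntimeException("simulated failure"); });
        runner.drain();
        System.out.println("attempts=" + runner.attempts() + " halted=" + runner.isHalted());
    }
}
```

Without the cap, the always-throwing task in main() would be resubmitted indefinitely, which is the loop described above. With the cap, the runner gives up after a few attempts and is back in the "needs manual intervention" state, so the sketch mostly moves the problem rather than solving it.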

Created CASSANDRA-8211 for the cause of the exception in this issue.

> Compactions stop completely because of RuntimeException in CompactionExecutor
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8190
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8190
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: DSE 4.5.2 (Cassandra 2.0.10)
>            Reporter: Nikolai Grigoriev
>            Assignee: Marcus Eriksson
>         Attachments: cassandra-env.sh, cassandra.yaml, jstack.txt.gz, system.log.gz, system.log.gz
>
>
> I have a cluster that is recovering from being overloaded with writes.  I am using the workaround from CASSANDRA-6621 to prevent the STCS fallback (which is killing the cluster - see CASSANDRA-7949).
> I have observed that after one or more exceptions like this
> {code}
> ERROR [CompactionExecutor:4087] 2014-10-26 22:50:05,016 CassandraDaemon.java (line 199) Exception in thread Thread[CompactionExecutor:4087,1,main]
> java.lang.RuntimeException: Last written key DecoratedKey(425124616570337476, 001000000000111100000000000003352​3da00001000000000033523da00000000111100000000100000000000004000000000000000000100) >= current key DecoratedKey(-8778432288598355336, 0010000000001111000000000000040c7a8f00001000000000040c7a8f00000000111100000000100000000000004000000000000000000100) writing into /cassandra-data/disk2/myks/mytable/myks-mytable-tmp-jb-130379-Data.db
>         at org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:142)
>         at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:165)
>         at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
>         at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>         at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
>         at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
>         at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:198)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> the node completely stops compacting and I end up in a state like this:
> {code}
> # nodetool compactionstats
> pending tasks: 1288
>           compaction type        keyspace           table       completed           total      unit  progress
> Active compaction remaining time :        n/a
> {code}
> The node recovers if restarted and starts compactions - until getting more exceptions like this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
