cassandra-commits mailing list archives

From "Jeff Jirsa (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-12763) Compaction performance issues when a table has a lot of sstables
Date Sun, 09 Oct 2016 23:07:20 GMT


Jeff Jirsa commented on CASSANDRA-12763:

As mentioned in IRC, I *THINK* that ultimately the problem is that this loop:

calls {{listFiles()}} for each sstablereader in the transaction. Since {{listFiles}} is notoriously
slow when there are a ton of files in the directory, and you call it N times, you end up waiting
at the end of the compaction while it prepares to commit and marks files for deletion. It may be
worth someone investigating whether or not we can avoid the full directory scan N times. Due to
the nature of that code, it may not be safe to cache the directory listing, but it's worth looking into.
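
To illustrate the cost pattern described above, here is a minimal Java sketch. It is not Cassandra's code: the class {{DirectoryListingSketch}}, its methods, and the file-name prefixes are all hypothetical. It contrasts one directory scan per sstable with a single scan that is reused for every sstable. As noted above, reusing a listing may not be safe in the real code, since files can appear or disappear between scans.

```java
import java.io.File;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; none of these names exist in Cassandra.
class DirectoryListingSketch {
    // Counts full directory scans so the two approaches can be compared.
    static int scans = 0;

    // One full directory scan; File.list() returns null on I/O error.
    static String[] scan(File dir) {
        scans++;
        String[] names = dir.list();
        return names == null ? new String[0] : names;
    }

    // N scans for N prefixes: the pattern the comment suspects is slow.
    static Set<String> matchPerPrefix(File dir, List<String> prefixes) {
        Set<String> out = new TreeSet<>();
        for (String prefix : prefixes) {
            for (String name : scan(dir)) {
                if (name.startsWith(prefix)) out.add(name);
            }
        }
        return out;
    }

    // One scan, reused for every prefix. Assumes the directory does not
    // change while the listing is in use, which may not hold in practice.
    static Set<String> matchSingleScan(File dir, List<String> prefixes) {
        String[] names = scan(dir);
        Set<String> out = new TreeSet<>();
        for (String prefix : prefixes) {
            for (String name : names) {
                if (name.startsWith(prefix)) out.add(name);
            }
        }
        return out;
    }
}
```

With N prefixes, {{matchPerPrefix}} performs N directory scans and {{matchSingleScan}} performs one. Both return the same set of names as long as the directory is unchanged between calls.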

> Compaction performance issues when a table has a lot of sstables
> ----------------------------------------------------------------
>                 Key: CASSANDRA-12763
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Compaction
>            Reporter: Tom van der Woerdt
> An issue with a script flooded my cluster with sstables. There is now a table with 100k
> sstables, all on the order of KBytes, and it's taking a long time (ETA 20 days) to compact,
> even though the table is only ~30GB.
> Stack trace :
> {noformat}
> "CompactionExecutor:269" #7536 daemon prio=1 os_prio=4 tid=0x00007f4acd40fc00 nid=0x14f8 runnable [0x00007f4798436000]
>    java.lang.Thread.State: RUNNABLE
> 	at Method)
> 	at
> 	at
> 	at org.apache.cassandra.db.lifecycle.LogRecord.getExistingFiles(
> 	at org.apache.cassandra.db.lifecycle.LogRecord.make(
> 	at org.apache.cassandra.db.lifecycle.LogFile.makeRecord(
> 	at org.apache.cassandra.db.lifecycle.LogFile.add(
> 	at org.apache.cassandra.db.lifecycle.LogTransaction.obsoleted(
> 	at org.apache.cassandra.db.lifecycle.Helpers.prepareForObsoletion(
> 	at org.apache.cassandra.db.lifecycle.LifecycleTransaction.doPrepare(
> 	at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(
> 	at
> 	at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(
> 	at org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.doPrepare(
> 	at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(
> 	at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.finish(
> 	at org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.finish(
> 	at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(
> 	at
> 	at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(
> 	at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(
> 	at org.apache.cassandra.db.compaction.CompactionManager$
> 	at java.util.concurrent.Executors$
> 	at
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(
> 	at java.util.concurrent.ThreadPoolExecutor$
> 	at
> {noformat}
> {{listFiles}} is being called over and over, apparently scaling with the number of files
> in the compaction.

This message was sent by Atlassian JIRA
