cassandra-commits mailing list archives

From "Tom van der Woerdt (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-12764) Compaction performance issues with many sstables, during transaction commit phase
Date Mon, 10 Oct 2016 20:12:20 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-12764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15563347#comment-15563347 ]

Tom van der Woerdt commented on CASSANDRA-12764:
------------------------------------------------

Oh, nice to finally link the IRC name to the Jira name :)

Yes, it was a lot faster. Here's a graph showing what happened over the last four days: https://i.imgur.com/AdWCCrR.png
(it graphs inode usage; divide by 8 for the sstable count)
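
A minimal sketch of that conversion, assuming the factor of 8 reflects the number of
component files stored per sstable (the constant and function name here are illustrative,
not taken from any tooling mentioned in this thread):

```python
# Assumption: each sstable accounts for roughly 8 inodes, per the
# "divide by 8" note above. Other files on the volume are ignored.
INODES_PER_SSTABLE = 8


def estimate_sstable_count(inodes_used: int) -> int:
    """Estimate the sstable count from a raw inode-usage reading."""
    return inodes_used // INODES_PER_SSTABLE


if __name__ == "__main__":
    # A reading of about 800k inodes corresponds to about 100k sstables.
    print(estimate_sstable_count(800_000))
```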

The red line is the node that started the mess. A botched repair[1] caused a nice 100k sstables.
This was noticed, and cleaned up.

Sadly it had already synced those 100k sstables to other nodes, which properly started compacting
the large amounts of files away. But then the regular automation jobs started a repair on
the node I wiped, streaming all the files all over the place :( Sadly I was unaware of this
until it was too late, and suddenly a lot of nodes on the cluster had 100k sstables :)

The sstable count was slowly going down (very, very slowly) but I figured I'd hop on IRC where
[~jjirsa] and [~brandon.williams] helped find a workaround (the table move). I applied it
to the most broken node first. On the graph it's the red line, look for the slope at the 10/10
boundary. This morning my script broke, so the final sstables took the slow route, but it
finished, and as you can see the scripted version is much faster than just letting compaction
run. I'm in the process of applying it to the two most broken nodes now, and will let the
others just finish.

Anyway, that's the story of how this happened, which was totally my fault :) Now I'm just
hoping that my mistake can lead to improvements in compaction performance.

Tom


[1]: subrange repair (similar to BrianGallew's code) on a LCS table, with 256 vnodes, and
most data not passing validation.

> Compaction performance issues with many sstables, during transaction commit phase
> ---------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-12764
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12764
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Compaction
>            Reporter: Tom van der Woerdt
>              Labels: lcs
>
> An issue with a script flooded my cluster with sstables. There is now a table with 100k
> sstables, all on the order of KBytes, and it's taking a long time (ETA 20 days) to compact,
> even though the table is only ~30GB.
> Stack trace:
> {noformat}
> "CompactionExecutor:308" #7541 daemon prio=1 os_prio=4 tid=0x00007fa22af35400 nid=0x41eb runnable [0x00007fdbea48d000]
>    java.lang.Thread.State: RUNNABLE
> 	at java.util.TimSort.countRunAndMakeAscending(TimSort.java:360)
> 	at java.util.TimSort.sort(TimSort.java:220)
> 	at java.util.Arrays.sort(Arrays.java:1438)
> 	at com.google.common.collect.Ordering.sortedCopy(Ordering.java:817)
> 	at org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:209)
> 	at org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:211)
> 	at org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:211)
> 	at org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:211)
> 	at org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:211)
> 	at org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:211)
> 	at org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:211)
> 	at org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:211)
> 	at org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:210)
> 	at org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:210)
> 	at org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:210)
> 	at org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:210)
> 	at org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:50)
> 	at org.apache.cassandra.db.lifecycle.SSTableIntervalTree.<init>(SSTableIntervalTree.java:40)
> 	at org.apache.cassandra.db.lifecycle.SSTableIntervalTree.build(SSTableIntervalTree.java:50)
> 	at org.apache.cassandra.db.lifecycle.View$4.apply(View.java:288)
> 	at org.apache.cassandra.db.lifecycle.View$4.apply(View.java:283)
> 	at com.google.common.base.Functions$FunctionComposition.apply(Functions.java:216)
> 	at org.apache.cassandra.db.lifecycle.Tracker.apply(Tracker.java:128)
> 	at org.apache.cassandra.db.lifecycle.Tracker.apply(Tracker.java:101)
> 	at org.apache.cassandra.db.lifecycle.LifecycleTransaction.checkpoint(LifecycleTransaction.java:307)
> 	at org.apache.cassandra.db.lifecycle.LifecycleTransaction.checkpoint(LifecycleTransaction.java:288)
> 	at org.apache.cassandra.io.sstable.SSTableRewriter.doPrepare(SSTableRewriter.java:368)
> 	at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173)
> 	at org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.doPrepare(CompactionAwareWriter.java:84)
> 	at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173)
> 	at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.finish(Transactional.java:184)
> 	at org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.finish(CompactionAwareWriter.java:94)
> 	at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:194)
> 	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> 	at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:78)
> 	at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61)
> 	at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:263)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> {noformat}
> IntervalTree shows up in a lot of stack traces I've taken on several nodes.
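
To illustrate why the trace above could be expensive with 100k sstables, here is a rough
sketch of a centered interval tree whose constructor sorts at every node, loosely mirroring
the recursive IntervalNode.<init> -> Ordering.sortedCopy frames. This is not Cassandra's
implementation; the names build and query, and the use of plain (lo, hi) tuples in place of
sstable token ranges, are assumptions made for the example. If the whole tree is rebuilt at
each checkpoint, as the trace suggests, every compaction pays for a full build over all
live sstables, however small the change.

```python
def build(intervals):
    """Build a centered interval tree from (lo, hi) pairs.

    Each node is (center, by_lo, by_hi, left, right). Every node sorts its
    own intervals twice, so one build costs about O(n log n) in total.
    """
    if not intervals:
        return None
    # Use the median endpoint as the center so both subtrees shrink.
    points = sorted(p for iv in intervals for p in iv)
    center = points[len(points) // 2]
    left = [iv for iv in intervals if iv[1] < center]
    right = [iv for iv in intervals if iv[0] > center]
    here = [iv for iv in intervals if iv[0] <= center <= iv[1]]
    return (
        center,
        sorted(here, key=lambda iv: iv[0]),                # ascending lo
        sorted(here, key=lambda iv: iv[1], reverse=True),  # descending hi
        build(left),
        build(right),
    )


def query(node, p):
    """Return every interval in the tree that contains point p."""
    out = []
    while node is not None:
        center, by_lo, by_hi, left, right = node
        if p < center:
            for iv in by_lo:
                if iv[0] > p:
                    break
                out.append(iv)
            node = left
        elif p > center:
            for iv in by_hi:
                if iv[1] < p:
                    break
                out.append(iv)
            node = right
        else:
            out.extend(by_lo)
            break
    return out


if __name__ == "__main__":
    import random
    import time

    # Hypothetical workload: 100k tiny ranges, rebuilt from scratch 5 times,
    # standing in for one full rebuild per checkpoint.
    random.seed(1)
    ranges = []
    for _ in range(100_000):
        lo = random.randrange(1_000_000)
        ranges.append((lo, lo + random.randrange(100)))
    start = time.perf_counter()
    for _ in range(5):
        build(ranges)
    print(f"5 full rebuilds: {time.perf_counter() - start:.2f}s")
```

The timing depends on the machine, so no expected figure is given; the point is only that
the cost of each rebuild scales with the total number of intervals, not with the handful
of sstables a single compaction replaces.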



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)