cassandra-commits mailing list archives

From "Ruoran Wang (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CASSANDRA-11599) When there are a large number of small repaired L0 sstables, compaction is very slow
Date Mon, 18 Apr 2016 20:59:25 GMT
Ruoran Wang created CASSANDRA-11599:
---------------------------------------

             Summary: When there are a large number of small repaired L0 sstables, compaction
is very slow
                 Key: CASSANDRA-11599
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11599
             Project: Cassandra
          Issue Type: Bug
         Environment: 2.1.13
            Reporter: Ruoran Wang


This is on a 6-node 2.1.13 cluster with the leveled compaction strategy.
 
Initially, I noticed missing metrics while heavy compaction was going on, because WrappingCompactionStrategy
was blocked.
Then I saw a case where compaction got stuck (progress moved dramatically slowly). There were
~29k sstables after an incremental repair, and I noticed tons of them were only 200+ bytes,
containing just one key. Again, this was because WrappingCompactionStrategy was blocked.
 
My guess is that, with 8 compaction executors and tons of small repaired L0 sstables, the first
thread is able to grab some (likely 32) sstables to compact. If this task covers a large
range of tokens, the other 7 threads will iterate through the sstables trying to find something
they can compact in the meantime, but fail in the end, since those candidate sstables intersect
with what is being compacted by the 1st thread. From a series of thread dumps, I noticed the
thread that is doing work always gets blocked by the other 7 threads.
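To illustrate the pattern I suspect (a hypothetical simplification, not Cassandra's actual code;
the names intersects(), repairedL0SSTables, and compactingSSTables are made up for illustration):

    // All 8 compaction threads serialize on the wrapping strategy.
    synchronized (wrappingCompactionStrategy)
    {
        SSTableReader found = null;
        for (SSTableReader candidate : repairedL0SSTables) // ~29k tiny sstables
        {
            // 7 of the 8 threads walk the whole list and reject every
            // candidate, because its token range intersects the wide
            // range already being compacted by the 1st thread.
            if (intersects(candidate, compactingSSTables))
                continue;
            found = candidate;
            break;
        }
        // found stays null, so the thread gives up without doing any
        // work, after having held the lock for the entire scan.
    }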
 
1. I tried splitting an incremental repair into 4 token ranges, which helped keep the sstable
count down. That seems to be working.
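For reference, the invocations looked something like this (a sketch only; the <tokenN> boundaries
are placeholders splitting the ring into 4 pieces, and my_keyspace stands in for the real keyspace):

    nodetool repair -par -inc -st <token0> -et <token1> my_keyspace
    nodetool repair -par -inc -st <token1> -et <token2> my_keyspace
    nodetool repair -par -inc -st <token2> -et <token3> my_keyspace
    nodetool repair -par -inc -st <token3> -et <token0> my_keyspace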

2. Another fix I tried is to replace ageSortedSSTables with a new method, keyCountSortedSSTables,
which returns the smallest sstables first (at org/apache/cassandra/db/compaction/LeveledManifest.java:586).
Since the candidates will then be 32 very small sstables, the condition 'SSTableReader.getTotalBytes(candidates)
> maxSSTableSizeInBytes' won't be met, and the compaction will just merge those 32 very small
sstables. This helps prevent the first compaction job from working on a set of sstables that
covers a wide token range.
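Roughly what I have in mind (a sketch only, meant to sit next to ageSortedSSTables inside
LeveledManifest; I'm assuming SSTableReader.estimatedKeys() as the sort key, and the exact
wiring into the candidate selection may differ):

    // Sort candidates by estimated key count so the tiny single-key
    // sstables produced by the incremental repair are picked first.
    private static List<SSTableReader> keyCountSortedSSTables(Collection<SSTableReader> candidates)
    {
        List<SSTableReader> sorted = new ArrayList<>(candidates);
        Collections.sort(sorted, new Comparator<SSTableReader>()
        {
            public int compare(SSTableReader a, SSTableReader b)
            {
                // fewest keys first
                return Long.compare(a.estimatedKeys(), b.estimatedKeys());
            }
        });
        return sorted;
    }

With the candidates ordered this way, the first 32 sstables picked up are the tiny ones, their
total size stays below maxSSTableSizeInBytes, and they get merged together instead of dragging
in a set that spans a wide token range.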

I can provide more info if needed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
