cassandra-commits mailing list archives

From "Wei Deng (JIRA)" <>
Subject [jira] [Updated] (CASSANDRA-12615) Improve LCS compaction concurrency during L0->L1 compaction
Date Tue, 06 Sep 2016 05:06:20 GMT


Wei Deng updated CASSANDRA-12615:
    Labels: lcs  (was: )

> Improve LCS compaction concurrency during L0->L1 compaction
> -----------------------------------------------------------
>                 Key: CASSANDRA-12615
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction
>            Reporter: Wei Deng
>              Labels: lcs
>         Attachments: L0_L1_inefficiency.jpg
> I've done multiple experiments with {{compaction-stress}} at the 100GB, 200GB, 400GB and
600GB levels. These scenarios share a common pattern: at the beginning of compaction they
all overwhelm L0 with a large number (hundreds to thousands) of 128MB SSTables. One common
observation from visualizing the compaction.log files of these tests is that, after some
massive STCS-in-L0 activity (which can take up to 40% of total compaction time), L0->L1
always takes a very long time; it frequently involves all of the bigger L0 SSTables (the
results of the many earlier STCS compactions) and all 10 of the L1 SSTables, and its output covers
almost the full data set. Since L0->L1 can only run single-threaded, we often spend
close to 40% of the total compaction time in this L0->L1 stage, and only after this first
very long L0->L1 finishes and hundreds or thousands of SSTables land on L1 can concurrent compactions
at higher levels resume (to move those L1 SSTables to higher levels). The attached
snapshot demonstrates this observation.
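A rough back-of-the-envelope model (illustrative numbers, not measurements from these test runs) shows why a serialized L0->L1 phase dominates wall-clock time even when many compaction threads are available:

```python
# Hypothetical Amdahl-style model of total compaction wall-clock time
# when the L0->L1 phase is serialized but L1+ up-leveling runs in
# parallel. The work units below are illustrative, not measured.

def total_compaction_time(serial_work, parallel_work, threads):
    """Serialized L0->L1 work plus parallelizable L1+ work."""
    return serial_work + parallel_work / threads

# Suppose L0->L1 is 40 units of work and L1+ up-leveling is 60 units.
one_thread = total_compaction_time(40, 60, 1)     # 100.0
eight_threads = total_compaction_time(40, 60, 8)  # 47.5

# With 8 threads, the serialized L0->L1 phase is ~84% of wall time.
serial_share = 40 / eight_threads
print(one_thread, eight_threads, round(serial_share, 2))
```

The more threads we add, the larger the fraction of wall-clock time the single-threaded L0->L1 phase occupies, which matches the pattern visible in the compaction logs.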
> The question is: if this L0->L1 compaction is so big, can only run single-threaded,
and ends up generating thousands of L1 SSTables, most of which will have to up-level later
anyway (since L1 can accommodate at most 10 SSTables), why not start that L1+ up-leveling earlier,
i.e. before this L0->L1 compaction finishes?
> I can think of a few approaches: 1) break L0->L1 into smaller chunks when the output
of the compaction is going to far exceed the capacity of L1; this lets each L0->L1 chunk
finish sooner and allows the resulting L1 SSTables to participate in higher up-level
activities; 2) still treat the full L0->L1 as one big compaction session, but make the
intermediate results available for higher up-level activities once the number of L1 SSTables
output exceeds the L1 capacity.
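Approach 1 could be sketched roughly as follows. The helper below is hypothetical (not Cassandra code), and the size constants are illustrative; the idea is to greedily group L0 candidates so each chunk's estimated output stays within L1's capacity:

```python
# Sketch of approach 1 (hypothetical helper, not Cassandra's API):
# split one huge L0->L1 compaction into smaller chunks whose estimated
# output fits within L1 capacity, so each chunk finishes sooner and its
# L1 output can start up-leveling while other chunks are still running.

SSTABLE_SIZE_MB = 160                    # illustrative target SSTable size
L1_CAPACITY_MB = 10 * SSTABLE_SIZE_MB    # L1 holds roughly 10 SSTables

def chunk_l0_candidates(l0_sizes_mb, chunk_budget_mb=L1_CAPACITY_MB):
    """Greedily group L0 SSTables so each group's total input size
    (a crude proxy for its L1 output size) fits the chunk budget."""
    chunks, current, used = [], [], 0
    for size in sorted(l0_sizes_mb, reverse=True):
        if current and used + size > chunk_budget_mb:
            chunks.append(current)
            current, used = [], 0
        current.append(size)
        used += size
    if current:
        chunks.append(current)
    return chunks

# 20 backlogged 128MB L0 SSTables -> two smaller compactions instead of
# one monolithic L0->L1 that blocks everything else.
chunks = chunk_l0_candidates([128] * 20)
print(len(chunks))  # 2
```

A real implementation would chunk by token range rather than raw size so chunk outputs do not overlap, but the scheduling benefit is the same: no single L0->L1 unit monopolizes the compaction executor.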
> If we can somehow leverage more threads during this massive L0->L1 phase, we can save
close to 40% of the total compaction time when L0 is initially backlogged, which would be a
great improvement to our LCS compaction throughput.
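Approach 2 above could be sketched like this (again a hypothetical stand-in, not Cassandra's compaction API): the big L0->L1 session stays intact, but finished L1 SSTables are published for up-leveling as soon as L1's capacity is exceeded, instead of only when the whole session commits:

```python
# Sketch of approach 2 (hypothetical, not Cassandra code): keep the big
# L0->L1 compaction as one session, but publish completed L1 SSTables
# to higher-level compactions as soon as more than L1_CAPACITY of them
# have accumulated, rather than waiting for the session to finish.

L1_CAPACITY = 10

def l0_to_l1_compaction(n_outputs):
    """Stand-in for the big L0->L1 session: yields L1 SSTables one by one."""
    for i in range(n_outputs):
        yield f"l1-sstable-{i}"

def publish_incrementally(compaction):
    """Buffer outputs; once the buffer exceeds L1_CAPACITY, hand the
    whole batch to L1->L2 up-leveling immediately."""
    ready_for_uplevel, buffered = [], []
    for sstable in compaction:
        buffered.append(sstable)
        if len(buffered) > L1_CAPACITY:
            ready_for_uplevel.extend(buffered)
            buffered = []
    return ready_for_uplevel, buffered

uplevel, remaining = publish_incrementally(l0_to_l1_compaction(25))
print(len(uplevel), len(remaining))  # 22 3
```

The point of the sketch is only the scheduling shape: higher-level compaction threads get work long before the L0->L1 session ends, which is where the potential ~40% saving comes from.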

This message was sent by Atlassian JIRA
