cassandra-commits mailing list archives

From "Yuki Morishita (JIRA)" <>
Subject [jira] [Updated] (CASSANDRA-4310) Multiple independent Level Compactions in Parallel
Date Mon, 08 Oct 2012 16:38:03 GMT


Yuki Morishita updated CASSANDRA-4310:

    Attachment: 4310-v3.txt

V3 attached.

bq. Shouldn't "skip submitBackground" logic be "CF is currently compacting, AND no more idle

Yes. Fixed this.

bq. Looks like compactingCF.remove(cfs); could be moved to the topmost finally block of run()


bq. We already track compacting sstables in DataTracker.View; seems like we shouldn't need
to duplicate this in LeveledManifest. (I note that getting this correct in DT took some effort
so I am worried on that level as well as abstract code purity.) If the problem is that getNextBackgroundTask
is called multiple times before CompactionManager officially marks the sstables as in-progress,
my preferred solution would be to move the in-progress code earlier.

I modified the code to mark sstables as compacting before returning a compaction task, so that I can use DataTracker.View's compacting property. If nothing can be marked, then no task is returned.

I also cleaned up code that is no longer used.
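The "mark before returning" change above can be sketched as follows. This is a minimal, hypothetical illustration with made-up class and method names (a `HashSet` standing in for DataTracker.View's compacting set), not Cassandra's actual code:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: claim the whole candidate set as compacting *before*
// a task is handed out, so repeated calls to a getNextBackgroundTask-style
// method can never return overlapping work.
class CompactingTracker {
    // stands in for DataTracker.View's compacting property
    private final Set<String> compacting = new HashSet<>();

    // Atomically claim the candidates; if any is already compacting,
    // claim nothing and report failure (i.e. no task is returned).
    synchronized boolean markCompacting(Set<String> candidates) {
        for (String sstable : candidates)
            if (compacting.contains(sstable))
                return false;
        compacting.addAll(candidates);
        return true;
    }

    // Called from the task's finally block once compaction finishes.
    synchronized void unmarkCompacting(Set<String> candidates) {
        compacting.removeAll(candidates);
    }
}
```

Because marking happens before the task is returned, a second caller that races in simply gets no task instead of a duplicate one.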
> Multiple independent Level Compactions in Parallel
> --------------------------------------------------
>                 Key: CASSANDRA-4310
>                 URL:
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>    Affects Versions: 1.0.0
>            Reporter: sankalp kohli
>            Assignee: Yuki Morishita
>              Labels: compaction, features, leveled, performance, ssd
>             Fix For: 1.2.1
>         Attachments: 4310.txt, 4310-v2.txt, 4310-v3.txt
> Problem: If you are inserting data into Cassandra and level compaction cannot catch up, you will create a lot of files in L0.
> Here is a solution which will help here and also increase the performance of level compaction. We can do many compactions in parallel for unrelated data.
> 1) For non-overlapping levels. Ex: when an L0 sstable is compacting with L1, we can do compactions in other levels like L2 and L3 if they are eligible.
> 2) We can also do compactions with files in L1 which are not participating in L0 compactions.
> This is especially useful if you are using SSDs and are not bottlenecked by IO.
> I am seeing this issue in my cluster. The compactions pending are more than 50k and the disk usage is not that much (I am using SSDs).
> I have set multithreaded compaction to true and am also not throttling the IO (value set to 0).
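The two rules quoted above amount to a conflict check: a leveled compaction from level L merges into L+1, so two compactions can run in parallel unless the level pairs they touch intersect and their token spans overlap. A minimal sketch, with illustrative names and integer token spans rather than Cassandra's actual types:

```java
import java.util.List;

// Illustrative sketch, not Cassandra's actual classes: a compaction from
// level L merges into L+1, so it "touches" levels {L, L+1} over a token span.
class Candidate {
    final int level;
    final long left, right; // token span of the participating sstables
    Candidate(int level, long left, long right) {
        this.level = level; this.left = left; this.right = right;
    }
    boolean conflictsWith(Candidate o) {
        // {L, L+1} and {M, M+1} intersect iff |L - M| <= 1
        boolean levelsTouch = Math.abs(level - o.level) <= 1;
        boolean rangesOverlap = left <= o.right && o.left <= right;
        return levelsTouch && rangesOverlap;
    }
}

class ParallelCompactionCheck {
    // Rule 1: an L2->L3 compaction never conflicts with an L0->L1 one.
    // Rule 2: an L1->L2 compaction is allowed if its token span avoids
    //         the L1 files pulled into the ongoing L0->L1 compaction.
    static boolean eligible(Candidate candidate, List<Candidate> active) {
        for (Candidate a : active)
            if (candidate.conflictsWith(a))
                return false;
        return true;
    }
}
```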

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see:
