cassandra-commits mailing list archives

From "Yuki Morishita (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-4310) Multiple independent Level Compactions in Parallel
Date Wed, 10 Oct 2012 17:01:04 GMT


Yuki Morishita commented on CASSANDRA-4310:

v5 simplifies a lot. :)

There is one thing we need to fix.
The test I added in v2 (LCSTest#testParallelLeveledCompaction) should generate more SSTables
first in order to run compactions in parallel (right now it generates 20, which is not enough
to fill the compaction threads).

When I run that test alone with 128 SSTables (so all 128 sstables are in L0), leveled
compaction produces the following error:

12/10/10 10:23:12 ERROR compaction.LeveledManifest: At level 1, SSTableReader(path='build/test/cassandra/data/Keyspace1/StandardLeveled/Keyspace1-StandardLeveled-ia-186-Data.db')
[DecoratedKey(Token(bytes[3735]), 3735), DecoratedKey(Token(bytes[3738]), 3738)] overlaps
[DecoratedKey(Token(bytes[3737]), 3737), DecoratedKey(Token(bytes[3830]), 3830)].  This is
caused by a bug in Cassandra 1.1.0 .. 1.1.3.  Sending back to L0.  If you have not yet run
scrub, you should do so since you may also have rows out-of-order within an sstable

So we have to check candidates against the L0 sstables that are already compacting, to
avoid generating overlapping sstables in L1.
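
For illustration only, the check described above might look roughly like the sketch below. It is not the actual patch: it is written in Python rather than Cassandra's Java, and the names SSTable, overlaps and safe_l0_candidates are hypothetical. It assumes each sstable can be summarized by its first and last key, and that a candidate must be skipped whenever its key range intersects an L0 sstable that another compaction is already working on, since two such compactions could each write L1 output covering the same range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SSTable:
    """Hypothetical stand-in for an sstable: a name plus its key range."""
    name: str
    first: int  # smallest key in the sstable
    last: int   # largest key in the sstable


def overlaps(a: SSTable, b: SSTable) -> bool:
    """True when the closed key ranges [first, last] of a and b intersect."""
    return a.first <= b.last and b.first <= a.last


def safe_l0_candidates(candidates, compacting_l0):
    """Keep only candidates that overlap no L0 sstable already being compacted."""
    return [
        c for c in candidates
        if not any(overlaps(c, busy) for busy in compacting_l0)
    ]


# The two key ranges reported in the error above do intersect.
in_l1 = SSTable("ia-186", 3735, 3738)
other = SSTable("other", 3737, 3830)
print(overlaps(in_l1, other))
```

With the ranges from the log, the second sstable would be filtered out as a candidate while the first is still compacting.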
Pushed fix to github:
Diff from v5:
> Multiple independent Level Compactions in Parallel
> --------------------------------------------------
>                 Key: CASSANDRA-4310
>                 URL:
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>    Affects Versions: 1.0.0
>            Reporter: sankalp kohli
>            Assignee: Yuki Morishita
>              Labels: compaction, features, leveled, performance, ssd
>             Fix For: 1.2.1
>         Attachments: 4310.txt, 4310-v2.txt, 4310-v3.txt, 4310-v5.txt
> Problem: If you are inserting data into Cassandra and leveled compaction cannot catch up,
> you will create a lot of files in L0.
> Here is a solution that will help here and also increase the performance of leveled compaction.
> We can do many compactions in parallel for unrelated data.
> 1) For non-overlapping levels. Ex: when an L0 sstable is compacting with L1, we can do
> compactions in other levels like L2 and L3 if they are eligible.
> 2) We can also do compactions with files in L1 which are not participating in L0 compactions.
> This is especially useful if you are using SSDs and are not bottlenecked by I/O.
> I am seeing this issue in my cluster. There are more than 50k pending compactions, and the
> disk usage is not that high (I am using SSDs).
> I have set multithreaded compaction to true and am not throttling I/O (the throttle value
> is set to 0).
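
The idea in the description, running compactions in parallel as long as they touch unrelated files, can be sketched as a simple greedy selection. This is a hypothetical Python illustration, not Cassandra's scheduler: pick_parallel_tasks and the sstable names are invented, and each proposed compaction is assumed to be given as the set of sstable names it would read, listed in priority order.

```python
def pick_parallel_tasks(proposed, max_threads):
    """Greedily choose compactions whose input sets share no sstable.

    proposed: list of sets of sstable names, in priority order.
    max_threads: upper bound on how many compactions may run at once.
    """
    busy = set()   # sstables already claimed by a chosen compaction
    chosen = []
    for inputs in proposed:
        if len(chosen) >= max_threads:
            break
        if busy.isdisjoint(inputs):
            chosen.append(inputs)
            busy |= inputs
    return chosen


proposed = [
    {"L0-1", "L1-a"},   # L0 compacting into L1
    {"L1-a", "L2-x"},   # needs L1-a, which the first task already holds
    {"L2-y", "L3-z"},   # independent of both, can run in parallel
]
print(pick_parallel_tasks(proposed, max_threads=4))
```

Here the first and third compactions can run together, while the second has to wait because it shares an input with the first.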

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators