cassandra-commits mailing list archives

From Germán Kondolf (JIRA) <j...@apache.org>
Subject [jira] Commented: (CASSANDRA-1876) Allow minor Parallel Compaction
Date Mon, 20 Dec 2010 12:59:02 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973186#action_12973186 ]

Germán Kondolf commented on CASSANDRA-1876:
-------------------------------------------

I hadn't considered that those new SSTables would be re-compacted so soon. That only happens
if the SSTable count is still high; if it drops below the minimum it won't be a real problem.
In fact, I think that with this patch we should raise the minimumCompactionThreshold.

On the other hand, maybe the approach I described in the mail thread would be worth
trying.

Link: http://markmail.org/message/d2uh4mu5qnzm456w

Could we compact just the indexes and filters and leave the data alone during minor
compactions?
It changes the storage structure a bit: there would no longer be a one-to-one relation between
the 3 parts of the SSTable, but in memory it reduces the number of filters to check.

LogicSSTable could have:
- IndexFile
- FilterFile
- SSTableFile[]

The IndexFile structure would also record which SSTableFile holds each row.

The drawback is that expired items won't be removed from the SSTables; instead, we simply
won't index or filter them, which keeps the memory model efficient while we're receiving new
items.
The major compaction will still have to do the real "housekeeping", and that kind of compaction
won't produce LogicSSTables.
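
A minimal sketch of the LogicSSTable idea above, in Java. Every name here (LogicSSTableSketch,
RowLocation, minorCompact, the file names) is hypothetical and exists neither in Cassandra nor in
the attached patch. It also assumes each key resolves to a single location, with later files
overriding earlier ones, and uses a plain HashSet as a stand-in for a real Bloom filter.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical illustration only; these types are not part of Cassandra. */
public class LogicSSTableSketch {

    /** Which data file holds a row, and at what offset inside that file. */
    static final class RowLocation {
        final int fileIndex;
        final long offset;

        RowLocation(int fileIndex, long offset) {
            this.fileIndex = fileIndex;
            this.offset = offset;
        }
    }

    /** One merged index and one merged filter in front of several untouched data files. */
    static final class LogicSSTable {
        final Map<String, RowLocation> index = new TreeMap<>(); // IndexFile
        final Set<String> filter = new HashSet<>();             // FilterFile (Bloom filter stand-in)
        final List<String> dataFiles = new ArrayList<>();       // SSTableFile[]

        /** Returns the row's location, or null if the key is not indexed. */
        RowLocation locate(String key) {
            return filter.contains(key) ? index.get(key) : null;
        }
    }

    /**
     * Index-only "minor compaction": merges the per-file indexes and filters,
     * leaves the data files as they are, and skips expired keys so they are
     * neither indexed nor filtered (they stay on disk until a major compaction).
     * Files are assumed to be ordered oldest to newest.
     */
    static LogicSSTable minorCompact(List<String> files,
                                     List<Map<String, Long>> offsetsPerFile,
                                     Set<String> expiredKeys) {
        LogicSSTable merged = new LogicSSTable();
        for (int i = 0; i < files.size(); i++) {
            merged.dataFiles.add(files.get(i));
            for (Map.Entry<String, Long> e : offsetsPerFile.get(i).entrySet()) {
                if (expiredKeys.contains(e.getKey())) {
                    continue; // expired: not indexed, data left untouched
                }
                // A later (newer) file overrides the entry from an earlier one.
                merged.index.put(e.getKey(), new RowLocation(i, e.getValue()));
                merged.filter.add(e.getKey());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        LogicSSTable t = minorCompact(
                List.of("old-Data.db", "new-Data.db"),
                List.of(Map.of("a", 0L, "b", 10L), Map.of("b", 0L, "c", 5L)),
                Set.of("a"));
        RowLocation b = t.locate("b");
        System.out.println("b -> " + t.dataFiles.get(b.fileIndex) + " @ " + b.offset);
        System.out.println("a indexed? " + (t.locate("a") != null));
    }
}
```

With this layout a read checks one filter and one index instead of one pair per SSTable, at the
cost of an extra indirection from the index entry to the data file.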

> Allow minor Parallel Compaction
> -------------------------------
>
>                 Key: CASSANDRA-1876
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1876
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Germán Kondolf
>            Priority: Minor
>         Attachments: 1876-reformatted.txt, compactionPatch-V2.txt
>
>
> Hi,
> According to the dev's list discussion (1) I've patched the CompactionManager to allow
> parallel compaction.
> Mainly it splits the sstables to compact in the desired buckets, configured by a new
> parameter: compaction_parallelism with the current default of "1".
> Then, it just submits the units of work to a new executor and waits for the finalization.
> The patch was created in the trunk, so I don't know the exact affected version, I assume
> that is 0.8.
> I'll try to apply this patch to 0.6.X also for my current production installation, and
> then reattach it.
> (1) http://markmail.org/thread/cldnqfh3s3nufnke
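
The flow described in the issue (split into buckets, submit to an executor, wait for completion)
could look roughly like the sketch below. This is not the attached patch: the class and method
names are hypothetical, the round-robin bucketing is an assumption, and compact() is a placeholder
that only concatenates names instead of merging SSTables. Only the compaction_parallelism
parameter comes from the issue text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical illustration only; this is not the code from the attached patch. */
public class ParallelCompactionSketch {

    /** Splits the sstables into at most `parallelism` buckets (round-robin is an assumption). */
    static <T> List<List<T>> bucket(List<T> sstables, int parallelism) {
        int n = Math.max(1, Math.min(parallelism, sstables.size()));
        List<List<T>> buckets = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            buckets.add(new ArrayList<>());
        }
        for (int i = 0; i < sstables.size(); i++) {
            buckets.get(i % n).add(sstables.get(i));
        }
        return buckets;
    }

    /** Placeholder for the real compaction of one bucket. */
    static String compact(List<String> bucket) {
        return "compacted(" + String.join("+", bucket) + ")";
    }

    /** Submits one unit of work per bucket to a new executor and waits for all of them. */
    static List<String> compactInParallel(List<String> sstables, int compactionParallelism)
            throws Exception {
        List<List<String>> buckets = bucket(sstables, compactionParallelism);
        ExecutorService executor = Executors.newFixedThreadPool(buckets.size());
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (List<String> b : buckets) {
                tasks.add(() -> compact(b));
            }
            List<String> results = new ArrayList<>();
            // invokeAll blocks until every task is done and keeps the task order.
            for (Future<String> f : executor.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(compactInParallel(List.of("a", "b", "c", "d"), 2));
    }
}
```

With compactionParallelism set to 1 everything lands in a single bucket, which matches the
default of "1" mentioned in the issue.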

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

