cassandra-commits mailing list archives

From Germán Kondolf (JIRA) <j...@apache.org>
Subject [jira] Issue Comment Edited: (CASSANDRA-1876) Allow minor Parallel Compaction
Date Tue, 15 Feb 2011 02:00:57 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12994574#comment-12994574 ]

Germán Kondolf edited comment on CASSANDRA-1876 at 2/15/11 2:00 AM:
--------------------------------------------------------------------

I like the whole idea.

bq. we still have compaction threads tuned to low priority, by default, so the impact on the
rest of the system won't be very different. Nor do we expect to have so many input sstables
that we lose a lot in context switching between reader threads.

I've also tried setting the I/O priority (in a patch against 0.6.x); maybe we could add that
as a configuration option for the compaction process, to keep its impact low.
It only works on Linux, but I don't think that's a problem.

Also, if the queue in between is limited to a configurable size, the producer stage
(MERGE) will wait for a free slot in the queue.
We could then tune how many newly merged rows we buffer before writing them to disk
and, indirectly, control the memory used in the process.
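
To make the bounded-queue idea concrete, here is a minimal sketch of a producer (the MERGE stage) handing rows to a writer through a queue with a fixed capacity. It is only an illustration, not code from any patch: the class name BoundedMergeBuffer, the run method, and the use of plain strings as "rows" are all hypothetical, and the real merge and write steps are replaced by stand-ins.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: names and row type are illustrative assumptions.
class BoundedMergeBuffer {
    // Unique sentinel object (compared by identity) that marks the end of the merge.
    private static final String END = new String("END");

    /** Passes rows through a queue of the given capacity and returns what the writer received. */
    static List<String> run(List<String> mergedRows, int capacity) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        List<String> written = new ArrayList<>();

        Thread writer = new Thread(() -> {
            try {
                for (String row = queue.take(); row != END; row = queue.take()) {
                    written.add(row); // stand-in for writing the row to disk
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();

        try {
            for (String row : mergedRows) {
                queue.put(row); // blocks while the queue is full, so at most `capacity` rows are buffered
            }
            queue.put(END);
            writer.join(); // join() makes the writer's list contents visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return written;
    }
}
```

The capacity passed to run plays the role of the configurable limit: a smaller value means less memory held in merged rows, at the cost of the producer waiting more often.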

If you want, I can update to the latest trunk and prepare a draft patch.
What do you think?

> Allow minor Parallel Compaction
> -------------------------------
>
>                 Key: CASSANDRA-1876
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1876
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Germán Kondolf
>            Priority: Minor
>         Attachments: 1876-reformatted.txt, compactionPatch-V2.txt, compactionPatch-V3.txt
>
>
> Hi,
> According to the dev's list discussion (1) I've patched the CompactionManager to allow
> parallel compaction.
> Mainly it splits the sstables to compact in the desired buckets, configured by a new
> parameter: compaction_parallelism with the current default of "1".
> Then, it just submits the units of work to a new executor and waits for the finalization.
> The patch was created in the trunk, so I don't know the exact affected version, I assume
> that is 0.8.
> I'll try to apply this patch to 0.6.X also for my current production installation, and
> then reattach it.
> (1) http://markmail.org/thread/cldnqfh3s3nufnke
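
The flow described in the issue (split the sstables into buckets, submit each bucket to an executor, wait for all of them) could look roughly like the sketch below. This is not the attached patch: the class ParallelCompactionSketch, the round-robin split, and the placeholder "compaction" that only counts its inputs are assumptions made for illustration. Only the idea of a compaction_parallelism setting comes from the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: not the CompactionManager code from the patch.
class ParallelCompactionSketch {
    /** Splits items round-robin into at most `parallelism` buckets (assumed strategy). */
    static <T> List<List<T>> split(List<T> items, int parallelism) {
        int n = Math.max(1, Math.min(parallelism, items.size()));
        List<List<T>> buckets = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            buckets.add(new ArrayList<>());
        }
        for (int i = 0; i < items.size(); i++) {
            buckets.get(i % n).add(items.get(i));
        }
        return buckets;
    }

    /** Submits one task per bucket and waits for all of them; returns the number of inputs processed. */
    static int compactAll(List<String> sstables, int parallelism) {
        ExecutorService executor = Executors.newFixedThreadPool(Math.max(1, parallelism));
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (List<String> bucket : split(sstables, parallelism)) {
                Callable<Integer> task = () -> bucket.size(); // placeholder for compacting one bucket
                pending.add(executor.submit(task));
            }
            int total = 0;
            for (Future<Integer> f : pending) {
                total += f.get(); // waits for this unit of work to finish
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }
}
```

With a parallelism of 1 everything lands in a single bucket and runs as one task, which matches the stated default.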

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
