cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CASSANDRA-1876) Allow minor Parallel Compaction
Date Mon, 14 Feb 2011 20:04:59 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12994459#comment-12994459 ]

Jonathan Ellis commented on CASSANDRA-1876:
-------------------------------------------

A while ago I said:

bq. ideally we would parallelize within a single sstable (breaking out the deserialize / merge
/ write stages) but this is Hard.

It's hard, but for a lot of users (anyone where a single CF holds the bulk of the data) this
is the only kind of optimization that will make a difference.

There are five stages: read, deserialize, merge, serialize, and write. We probably want to
continue doing read+deserialize and serialize+write together; otherwise you waste a lot of
time copying to and from buffers.

So, what I would suggest is: one thread per input sstable doing read + deserialize (a row
at a time).  One thread merging corresponding rows from each input sstable.  One thread doing
serialize + write of the output.  This should give us between 2x and 3x speedup (depending
on how much we save by doing the merge on a different thread from the write).

This will require roughly 2x the memory, to allow the reader threads to work ahead of the
merge stage.  (I.e. for each input sstable you will have up to one row in a queue waiting
to be merged, and the reader thread working on the next.)  Seems quite reasonable on that
front.
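
As a rough illustration of the threading layout described above, here is a minimal Java
sketch.  It is not taken from any patch attached to this issue: the class name, the Row type,
the EOF sentinel and the "largest value wins" merge rule are all hypothetical stand-ins, and
each "sstable" is just an in-memory list of rows already sorted by key.  The capacity-1 queues
model the "one row waiting, one row being read" work-ahead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class ParallelCompactionSketch {
    // Hypothetical stand-in for a deserialized row: a key plus a single value.
    static final class Row {
        final String key;
        final long value;
        Row(String key, long value) { this.key = key; this.value = value; }
    }

    // Sentinel marking the end of a stream; compared by identity only.
    static final Row EOF = new Row(null, 0);

    interface Stage { void run() throws InterruptedException; }

    // Wraps a stage in a thread, converting interruption into an unchecked failure.
    static Thread stage(Stage s) {
        return new Thread(() -> {
            try { s.run(); } catch (InterruptedException e) { throw new IllegalStateException(e); }
        });
    }

    // Each input list must already be sorted by key, as rows in an sstable are.
    static List<Row> compact(List<List<Row>> sstables) {
        int n = sstables.size();
        List<BlockingQueue<Row>> inputs = new ArrayList<>();
        BlockingQueue<Row> merged = new ArrayBlockingQueue<>(1);
        List<Row> output = new ArrayList<>();
        List<Thread> threads = new ArrayList<>();

        // One reader per input: "read + deserialize" a row at a time, working one row ahead.
        for (List<Row> sstable : sstables) {
            BlockingQueue<Row> q = new ArrayBlockingQueue<>(1);
            inputs.add(q);
            threads.add(stage(() -> {
                for (Row r : sstable) q.put(r);
                q.put(EOF);
            }));
        }

        // One merger: combine the rows that share the smallest pending key.
        threads.add(stage(() -> {
            Row[] heads = new Row[n];
            for (int i = 0; i < n; i++) heads[i] = inputs.get(i).take();
            while (true) {
                String min = null;
                for (Row h : heads)
                    if (h != EOF && (min == null || h.key.compareTo(min) < 0)) min = h.key;
                if (min == null) break; // every input is exhausted
                long best = Long.MIN_VALUE;
                for (int i = 0; i < n; i++) {
                    if (heads[i] != EOF && heads[i].key.equals(min)) {
                        best = Math.max(best, heads[i].value); // stand-in for real reconciliation
                        heads[i] = inputs.get(i).take();
                    }
                }
                merged.put(new Row(min, best));
            }
            merged.put(EOF);
        }));

        // One writer: "serialize + write"; here it only appends to a list.
        threads.add(stage(() -> {
            for (Row r = merged.take(); r != EOF; r = merged.take()) output.add(r);
        }));

        for (Thread t : threads) t.start();
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return output;
    }
}
```

Calling ParallelCompactionSketch.compact with two sorted inputs returns one sorted list with a
single row per key.  A real implementation would also need error propagation between stages
and the fallback path for oversized rows, both of which this sketch leaves out.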

Multithreaded compaction should be either on or off.  It doesn't make sense to try to do things
halfway (by doing the reads with a threadpool whose size you can grow/shrink, for instance):
we still have compaction threads tuned to low priority by default, so the impact on the rest
of the system won't be very different.  Nor do we expect to have so many input sstables that
we lose a lot in context switching between reader threads.  (If this is a concern, we already
have a tunable to limit the number of sstables merged at a time in a single CF.)

IMO it's acceptable to punt completely on rows that are larger than memory, and fall back
to the old non-parallel code there.  I don't see any sane way to parallelize large-row compactions.

> Allow minor Parallel Compaction
> -------------------------------
>
>                 Key: CASSANDRA-1876
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1876
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Germán Kondolf
>            Priority: Minor
>         Attachments: 1876-reformatted.txt, compactionPatch-V2.txt, compactionPatch-V3.txt
>
>
> Hi,
> According to the dev's list discussion (1) I've patched the CompactionManager to allow
parallel compaction.
> Mainly it splits the sstables to compact in the desired buckets, configured by a new
parameter: compaction_parallelism with the current default of "1".
> Then, it just submits the units of work to a new executor and waits for the finalization.
> The patch was created in the trunk, so I don't know the exact affected version, I assume
that is 0.8.
> I'll try to apply this patch to 0.6.X also for my current production installation, and
then reattach it.
> (1) http://markmail.org/thread/cldnqfh3s3nufnke

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

       
