cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-2901) Allow taking advantage of multiple cores while compacting a single CF
Date Wed, 03 Aug 2011 05:05:27 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13078591#comment-13078591 ]

Jonathan Ellis commented on CASSANDRA-2901:
-------------------------------------------

Split out some fixes to the SSTII bytes tracker getting out of sync w/ the underlying stream,
and did some cleanup to make the streamed/file versions less divergent.

Also adds parallel compaction testing to LazilyCompactedRowTest.

CliTest and DefsTest generate compaction loads (in DefsTest's case, on the Migrations CF --
haven't dug into CliTest as much) that break w/ parallel enabled, although the tests don't
actually fail (argh).

Haven't figured out what's causing that, and haven't come up with a way to reproduce in a
"real" test yet.  The DefsTest does mix lazy/nonlazy iteration in the merge, which may be
relevant.

bq. I'm also not proposing to complicate things.

You're right, poor choice of words on my part.

Latest gives the merge executor a SynchronousQueue.  I think that's a better way to cut the
worst case than the Deserializer, for the reason given previously.
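
For readers unfamiliar with the idiom, here is a minimal sketch of an executor backed by a
SynchronousQueue. It is not the code from the attached patch: the class and method names are
hypothetical, summing integers stands in for a row merge, and the choice of CallerRunsPolicy
when every worker is busy is an assumption made only for illustration. The point is that a
SynchronousQueue holds no pending tasks, so the number of merges in flight stays bounded by
the pool size (plus the submitting thread).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class HandoffExecutorSketch {
    // Hypothetical helper: a fixed-size pool whose queue never buffers tasks.
    static ThreadPoolExecutor newHandoffExecutor(int threads) {
        return new ThreadPoolExecutor(
                threads, threads,
                60L, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>(),
                // Assumption: if no worker is free, the submitting thread runs the task itself.
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    // Submits one stand-in "merge" per group and returns results in submission order.
    static List<Integer> mergeAll(List<List<Integer>> rowGroups, int threads) {
        ThreadPoolExecutor executor = newHandoffExecutor(threads);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (List<Integer> group : rowGroups) {
                Callable<Integer> task = () -> group.stream().mapToInt(Integer::intValue).sum();
                pending.add(executor.submit(task));
            }
            List<Integer> merged = new ArrayList<>();
            for (Future<Integer> f : pending) {
                merged.add(f.get()); // waiting on futures in order preserves output order
            }
            return merged;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            executor.shutdown();
        }
    }
}
```

With an unbounded queue the submitter could race arbitrarily far ahead of the workers; with
the hand-off queue it cannot, which is the property the comment above relies on.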

bq. 'if...instanceof' business is a bit error prone/ugly

Agreed. Added getColumnCount + reset to ICountableColumnIterator sub-interface.
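
As a rough illustration of why a sub-interface is cleaner than 'if...instanceof', consider
the sketch below. Only the method names getColumnCount and reset come from the comment above;
every type name here is a hypothetical stand-in, strings stand in for columns, and the
assumption that reset rewinds to the first column is mine, not something the patch confirms.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for the project's column iterator type.
interface ColumnIteratorSketch extends Iterator<String> {}

// Sub-interface carrying the two extra operations named in the comment.
interface CountableColumnIteratorSketch extends ColumnIteratorSketch {
    int getColumnCount();
    void reset(); // assumed here to rewind to the first column
}

// Toy implementation backed by an in-memory list.
class ListBackedColumns implements CountableColumnIteratorSketch {
    private final List<String> columns;
    private int position = 0;

    ListBackedColumns(List<String> columns) { this.columns = columns; }

    public boolean hasNext() { return position < columns.size(); }

    public String next() {
        if (!hasNext()) throw new NoSuchElementException();
        return columns.get(position++);
    }

    public int getColumnCount() { return columns.size(); }

    public void reset() { position = 0; }
}

class CountableSketch {
    // The parameter type guarantees both operations exist: no instanceof checks or casts.
    // Returns the number of columns seen, or -1 if it disagrees with the declared count.
    static int countThenDrain(CountableColumnIteratorSketch iter) {
        int declared = iter.getColumnCount();
        iter.reset();
        int seen = 0;
        while (iter.hasNext()) {
            iter.next();
            seen++;
        }
        return declared == seen ? seen : -1;
    }
}
```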

bq. say how multithreaded_compaction is different from concurrent_compactors and that multithreaded_compaction
is likely only useful for SSDs in cassandra.yaml

done

bq. The bytesRead "race" should also be fixed in CompactionIterable

done

bq. I would have put the code in CompactedRow.close() at the end of the LCR.write() instead
of adding a new method, as it avoids forgetting to call close

I did consider that, but it feels weird to me to have write implicitly call close.  I guess
we could just change the method name? :)

bq. We can make PreCompactedRow.removeDeletedAndOldShards a public method and use it in PCI.MergeTask

done

> Allow taking advantage of multiple cores while compacting a single CF
> ---------------------------------------------------------------------
>
>                 Key: CASSANDRA-2901
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2901
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 0.8.4
>
>         Attachments: 0001-fix-tracker-getting-out-of-sync-with-underlying-data-s.txt,
>                      0002-parallel-compactions.txt
>
>
> Moved from CASSANDRA-1876:
> There are five stages: read, deserialize, merge, serialize, and write. We probably want
> to continue doing read+deserialize and serialize+write together, or you waste a lot of time
> copying to/from buffers.
> So, what I would suggest is: one thread per input sstable doing read + deserialize (a
> row at a time). A thread pool (one per core?) merging corresponding rows from each input sstable.
> One thread doing serialize + writing the output (this has to wait for the merge threads to
> complete in-order, obviously). This should take us from being CPU bound on SSDs (since only
> one core is compacting) to being I/O bound.
> This will require roughly 2x the memory, to allow the reader threads to work ahead of
> the merge stage. (I.e. for each input sstable you will have up to one row in a queue waiting
> to be merged, and the reader thread working on the next.) Seems quite reasonable on that front.
> You'll also want a small queue size for the serialize-merged-rows executor.
> Multithreaded compaction should be either on or off. It doesn't make sense to try to
> do things halfway (by doing the reads with a threadpool whose size you can grow/shrink,
> for instance): we still have compaction threads tuned to low priority, by default, so the
> impact on the rest of the system won't be very different. Nor do we expect to have so many
> input sstables that we lose a lot in context switching between reader threads.
> IMO it's acceptable to punt completely on rows that are larger than memory, and fall
> back to the old non-parallel code there. I don't see any sane way to parallelize large-row
> compactions.
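
The staged layout proposed in the description above can be sketched roughly as follows. This
is a toy model under loud assumptions, not the patch: every name is hypothetical, a "row" is
just an int key with a long value, each input is assumed to be sorted with strictly increasing
keys, "merge" simply sums values, and the output queue capacity of 4 is an arbitrary small
number. It shows one reader per input with a one-row queue, a merge pool, and a single writer
that consumes merge results in order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

class ParallelCompactionSketch {
    static final class Row {
        final int key;
        final long value;
        Row(int key, long value) { this.key = key; this.value = value; }
    }

    private static final Row END = new Row(0, 0); // sentinel, compared by identity only

    // Stand-in for row reconciliation: sums the values of all versions of a key.
    static Row merge(int key, List<Row> versions) {
        long total = 0;
        for (Row r : versions) total += r.value;
        return new Row(key, total);
    }

    static List<Row> compact(List<List<Row>> inputs, int mergeThreads) {
        ExecutorService readers = Executors.newFixedThreadPool(Math.max(1, inputs.size()));
        ExecutorService mergers = Executors.newFixedThreadPool(mergeThreads);
        ExecutorService writer = Executors.newSingleThreadExecutor();
        try {
            // Read stage: one reader per input, at most one row queued ahead of the merge.
            List<BlockingQueue<Row>> queues = new ArrayList<>();
            for (List<Row> input : inputs) {
                BlockingQueue<Row> q = new ArrayBlockingQueue<>(1);
                queues.add(q);
                readers.submit(() -> {
                    for (Row r : input) q.put(r);
                    q.put(END);
                    return null;
                });
            }
            // Write stage: a single thread takes merge results in submission order.
            BlockingQueue<Future<Row>> toWrite = new ArrayBlockingQueue<>(4);
            Future<Row> done = CompletableFuture.completedFuture(END);
            Future<List<Row>> written = writer.submit(() -> {
                List<Row> out = new ArrayList<>();
                for (Future<Row> f = toWrite.take(); f != done; f = toWrite.take()) {
                    out.add(f.get());
                }
                return out;
            });
            // Merge stage: group the rows sharing the smallest key, hand them to the pool.
            List<Row> heads = new ArrayList<>();
            for (BlockingQueue<Row> q : queues) heads.add(q.take());
            while (true) {
                Row min = null;
                for (Row h : heads) {
                    if (h != END && (min == null || h.key < min.key)) min = h;
                }
                if (min == null) break; // every input is exhausted
                int key = min.key;
                List<Row> versions = new ArrayList<>();
                for (int i = 0; i < heads.size(); i++) {
                    Row h = heads.get(i);
                    if (h != END && h.key == key) {
                        versions.add(h);
                        heads.set(i, queues.get(i).take());
                    }
                }
                Callable<Row> task = () -> merge(key, versions);
                toWrite.put(mergers.submit(task)); // blocks when the writer falls behind
            }
            toWrite.put(done);
            return written.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            readers.shutdownNow();
            mergers.shutdownNow();
            writer.shutdownNow();
        }
    }
}
```

The bounded queues are what keep memory use predictable in this sketch: each reader can be
at most one row ahead, and the coordinating thread stalls once a handful of merged rows are
waiting to be written.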

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
