From: "Jonathan Ellis (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Date: Tue, 6 Sep 2011 21:17:10 +0000 (UTC)
Message-ID: <1627053936.22251.1315343830604.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <968133626.16119.1310699100244.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Updated] (CASSANDRA-2901) Allow taking advantage of multiple cores while compacting a single CF
Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm

[
https://issues.apache.org/jira/browse/CASSANDRA-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-2901:
--------------------------------------
    Attachment: 2901-trunk.txt

> Allow taking advantage of multiple cores while compacting a single CF
> ---------------------------------------------------------------------
>
>          Key: CASSANDRA-2901
>          URL: https://issues.apache.org/jira/browse/CASSANDRA-2901
>      Project: Cassandra
>   Issue Type: Improvement
>   Components: Core
>     Reporter: Jonathan Ellis
>     Assignee: Jonathan Ellis
>     Priority: Minor
>      Fix For: 1.0
>
>  Attachments: 2901-0.8.txt, 2901-trunk.txt
>
>
> Moved from CASSANDRA-1876:
>
> There are five stages: read, deserialize, merge, serialize, and write. We probably want to continue doing read+deserialize and serialize+write together, or you waste a lot copying to/from buffers.
>
> So, what I would suggest is: one thread per input sstable doing read + deserialize (a row at a time). A thread pool (one per core?) merging corresponding rows from each input sstable. One thread doing serialize + writing the output (this has to wait for the merge threads to complete in-order, obviously). This should take us from being CPU bound on SSDs (since only one core is compacting) to being I/O bound.
>
> This will require roughly 2x the memory, to allow the reader threads to work ahead of the merge stage. (I.e. for each input sstable you will have up to one row in a queue waiting to be merged, and the reader thread working on the next.) Seems quite reasonable on that front. You'll also want a small queue size for the serialize-merged-rows executor.
>
> Multithreaded compaction should be either on or off. It doesn't make sense to try to do things halfway (by doing the reads with a threadpool whose size you can grow/shrink, for instance): we still have compaction threads tuned to low priority, by default, so the impact on the rest of the system won't be very different. Nor do we expect to have so many input sstables that we lose a lot in context switching between reader threads.
>
> IMO it's acceptable to punt completely on rows that are larger than memory, and fall back to the old non-parallel code there. I don't see any sane way to parallelize large-row compactions.
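
The three-stage layout described in the ticket (one reader per input sstable, a merge pool, and a single in-order writer) can be illustrated with a minimal Java sketch. It is not the attached patch and uses none of Cassandra's classes: `Row`, `compact`, `merge`, the in-memory lists standing in for sstables, and the pending-queue capacity of 4 are all hypothetical names and values chosen for illustration. It assumes each input is sorted by key with at most one row per key, and "merging" is reduced to concatenating values.

```java
import java.util.*;
import java.util.concurrent.*;

class ParallelCompactionSketch {
    // Hypothetical row type; a real row would carry columns, not a single string.
    static final class Row {
        final String key, value;
        Row(String key, String value) { this.key = key; this.value = value; }
    }
    static final Row EOF = new Row(null, null); // end-of-input marker

    // Placeholder merge: joins the values of rows that share a key.
    static Row merge(String key, List<Row> rows) {
        StringJoiner joined = new StringJoiner("+");
        for (Row r : rows) joined.add(r.value);
        return new Row(key, joined.toString());
    }

    static List<Row> compact(List<List<Row>> inputs) throws Exception {
        int n = inputs.size();
        // Stage 1: one reader thread per input. A capacity-1 queue lets each
        // reader hold one row queued while it works on the next.
        ExecutorService readers = Executors.newFixedThreadPool(Math.max(1, n));
        List<BlockingQueue<Row>> queues = new ArrayList<>();
        for (List<Row> input : inputs) {
            BlockingQueue<Row> q = new ArrayBlockingQueue<>(1);
            queues.add(q);
            readers.submit(() -> {
                for (Row r : input) q.put(r);
                q.put(EOF);
                return null;
            });
        }
        // Stage 2: merge pool, one thread per core.
        ExecutorService mergers =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        // Stage 3: a single writer consumes merge results in submission order.
        // The small bounded queue limits how far merging can run ahead of it.
        BlockingQueue<Future<Row>> pending = new ArrayBlockingQueue<>(4);
        Future<Row> done = CompletableFuture.completedFuture(EOF);
        List<Row> output = new ArrayList<>();
        Thread writer = new Thread(() -> {
            try {
                for (Future<Row> f = pending.take(); f != done; f = pending.take())
                    output.add(f.get()); // blocks until this row's merge finishes
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        });
        writer.start();

        // Coordinator: pick the smallest key among the queue heads, collect
        // every row with that key, and hand the group to the merge pool.
        Row[] heads = new Row[n];
        for (int i = 0; i < n; i++) heads[i] = queues.get(i).take();
        while (true) {
            String min = null;
            for (Row h : heads)
                if (h != EOF && (min == null || h.key.compareTo(min) < 0)) min = h.key;
            if (min == null) break; // every input is exhausted
            List<Row> same = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                if (heads[i] != EOF && heads[i].key.equals(min)) {
                    same.add(heads[i]);
                    heads[i] = queues.get(i).take();
                }
            }
            final String key = min;
            pending.put(mergers.submit(() -> merge(key, same)));
        }
        pending.put(done);
        writer.join();
        readers.shutdown();
        mergers.shutdown();
        return output;
    }
}
```

Calling `ParallelCompactionSketch.compact` with a list of key-sorted row lists returns the merged rows in key order. Error handling is deliberately thin: if a merge task failed, the writer thread would die and the coordinator could block on the pending queue, so a real implementation would need cancellation and the fallback path for rows larger than memory that the ticket mentions.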