Date: Tue, 16 Aug 2016 03:06:22 +0000 (UTC)
From: "Wei Deng (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Updated] (CASSANDRA-12464) Investigate the potential improvement of parallelism on higher level compactions in LCS

     [ https://issues.apache.org/jira/browse/CASSANDRA-12464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei Deng updated CASSANDRA-12464:
---------------------------------
    Labels: lcs performance  (was: )

> Investigate the potential improvement of
parallelism on higher level compactions in LCS
> ---------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-12464
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12464
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction
>            Reporter: Wei Deng
>              Labels: lcs, performance
>
> According to LevelDB's design doc [here|https://github.com/google/leveldb/blob/master/doc/impl.html#L115-L116], "A compaction merges the contents of the picked files to produce a sequence of level-(L+1) files". It will "switch to producing a new level-(L+1) file after the current output file has reached the target file size" (in our case 160MB), and it will also "switch to a new output file when the key range of the current output file has grown enough to overlap more than ten level-(L+2) files". This is to ensure "that a later compaction of a level-(L+1) file will not pick up too much data from level-(L+2)."
> Our current code in LeveledCompactionStrategy doesn't implement this last rule, but we might be able to implement it quickly and see how much of a compaction throughput improvement it can deliver. Potentially we can create a scenario where a number of large L0 SSTables are present (e.g. 200GB after switching from STCS), let LCS create an overflow of thousands of L1 SSTables, and see how fast it can digest this much data from L1 and promote it to the higher levels until compaction completes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
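
For illustration only, the two output-switching rules quoted from the LevelDB design doc could be sketched roughly as below. This is a toy model, not Cassandra's LeveledCompactionStrategy or LevelDB's actual code. The names split_outputs and count_overlaps, the (key, size) entry shape, and the representation of level-(L+2) files as sorted, non-overlapping (first_key, last_key) ranges are all assumptions made for the example.

```python
from bisect import bisect_left, bisect_right


def count_overlaps(grandparents, lo, hi):
    """Count level-(L+2) ranges that intersect the key range [lo, hi].

    grandparents: hypothetical list of (first_key, last_key) tuples,
    assumed sorted by key and non-overlapping.
    """
    firsts = [first for first, _ in grandparents]
    lasts = [last for _, last in grandparents]
    start = bisect_left(lasts, lo)    # first range ending at or after lo
    end = bisect_right(firsts, hi)    # number of ranges starting at or before hi
    return max(0, end - start)


def split_outputs(entries, grandparents, target_size, max_overlap=10):
    """Group sorted (key, size) entries into level-(L+1) output files.

    A new output file is started when the current one has reached
    target_size, or when adding the next key would make its key range
    overlap more than max_overlap level-(L+2) files.
    """
    outputs, current, current_size = [], [], 0
    for key, size in entries:
        if current:
            first_key = current[0][0]
            too_big = current_size >= target_size
            too_wide = count_overlaps(grandparents, first_key, key) > max_overlap
            if too_big or too_wide:
                outputs.append(current)
                current, current_size = [], 0
        current.append((key, size))
        current_size += size
    if current:
        outputs.append(current)
    return outputs
```

In this sketch the overlap check runs before a key is appended, so no output file ever spans more than max_overlap level-(L+2) ranges. A real implementation would presumably track the overlap incrementally while iterating rather than recomputing it for every key.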