Date: Sat, 22 Sep 2012 03:03:08 +1100 (NCT)
From: "Jonathan Ellis (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-4310) Multiple independent Level Compactions in Parallel

[ https://issues.apache.org/jira/browse/CASSANDRA-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460582#comment-13460582 ]

Jonathan Ellis commented on CASSANDRA-4310:
-------------------------------------------

bq.
there are no more idle executor threads

Worth elaborating that we have slightly conflicting goals here:

# we want to keep all {{concurrent_compactors}} threads busy
# but we don't want to enqueue tasks farther in advance than we need to, since other flushes or compactions finishing in the meantime can give us more information, which lets us create a better task

So I think what we want is to track "currently being compacted CFs" in CompactionManager. If a CF is currently being compacted and there are no idle threads, submitBackground can be a no-op; we can wait for the current compaction to finish and re-submit when more information is available. Otherwise, we should submit at least one task to prevent starvation by busier CFs, and more if there are still idle threads.

> Multiple independent Level Compactions in Parallel
> --------------------------------------------------
>
>                 Key: CASSANDRA-4310
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4310
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>    Affects Versions: 1.0.0
>            Reporter: sankalp kohli
>            Assignee: Yuki Morishita
>              Labels: compaction, features, leveled, performance, ssd
>             Fix For: 1.2.1
>
>         Attachments: 4310.txt
>
>
> Problem: If you are inserting data into Cassandra and leveled compaction cannot catch up, you will create a lot of files in L0.
> Here is a solution that will help here and also increase the performance of leveled compaction.
> We can do many compactions in parallel for unrelated data.
> 1) For non-overlapping levels. Ex: when an L0 sstable is compacting with L1, we can do compactions in other levels like L2 and L3 if they are eligible.
> 2) We can also do compactions with files in L1 that are not participating in L0 compactions.
> This is especially useful if you are using SSDs and are not bottlenecked by IO.
> I am seeing this issue in my cluster. There are more than 50k pending compactions, yet the disk usage is not that high (I am using SSD).
> I have also set multithreaded compaction to true and disabled IO throttling by setting its value to 0.
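
The submission rule proposed in the comment (no-op when the CF is already compacting and no thread is idle; otherwise submit at least one task, and more while threads are idle) could be sketched roughly as below. This is only an illustration under stated assumptions, not Cassandra's CompactionManager: the class name, the submitBackground signature, the taskFinished callback, and the plain counters standing in for an executor are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the proposed rule; not Cassandra's actual code.
final class CompactionSchedulerSketch {
    // Assumed to mirror the number of compactor threads.
    private final int compactorThreads;
    // Tasks that are running or queued (stand-in for a real executor).
    private int activeTasks = 0;
    // "Currently being compacted CFs": CF name -> outstanding task count.
    private final Map<String, Integer> tasksPerCf = new HashMap<>();

    CompactionSchedulerSketch(int compactorThreads) {
        this.compactorThreads = compactorThreads;
    }

    /**
     * Decides how many of the candidate tasks for a CF to submit now.
     * Returns the number of tasks submitted (0 means no-op).
     */
    synchronized int submitBackground(String cf, int candidateTasks) {
        int idleThreads = Math.max(0, compactorThreads - activeTasks);
        boolean cfIsCompacting = tasksPerCf.getOrDefault(cf, 0) > 0;

        // Already compacting and nothing idle: wait, so that a later
        // re-submit can use fresher information.
        if (candidateTasks <= 0 || (cfIsCompacting && idleThreads == 0)) {
            return 0;
        }

        // At least one task (avoids starvation by busier CFs),
        // and more only while idle threads remain.
        int toSubmit = Math.min(candidateTasks, Math.max(1, idleThreads));
        activeTasks += toSubmit;
        tasksPerCf.merge(cf, toSubmit, Integer::sum);
        return toSubmit;
    }

    /** Called when one task for the CF completes; the caller may then re-submit. */
    synchronized void taskFinished(String cf) {
        Integer outstanding = tasksPerCf.get(cf);
        if (outstanding == null) {
            return; // nothing tracked for this CF
        }
        activeTasks--;
        if (outstanding == 1) {
            tasksPerCf.remove(cf);
        } else {
            tasksPerCf.put(cf, outstanding - 1);
        }
    }
}
```

In this sketch a CF that is not yet compacting still gets one queued task even when every thread is busy, which is how the "prevent starvation" clause is interpreted here; a real implementation would hand the tasks to an executor rather than just count them.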