Date: Sun, 22 Feb 2015 20:37:12 +0000 (UTC)
From: "Carl Yeksigian (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-7409) Allow multiple overlapping sstables in L1

    [ https://issues.apache.org/jira/browse/CASSANDRA-7409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14332345#comment-14332345 ]

Carl Yeksigian commented on CASSANDRA-7409:
-------------------------------------------

I've pushed up an updated branch which addresses these concerns. I can rebase if it looks good.

The reason that I used the sstable count instead of size in total bytes is I'm trying to find a level which has a lot of small files. If the level is oversized, it will go through a normal compaction, but if there are too many sstables, we don't catch that anywhere.
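
As a rough illustration of the distinction drawn here (this is not the code on the branch), the two triggers could be sketched as below. The function name and the thresholds `max_level_bytes` and `max_sstable_count` are hypothetical and chosen only for this example.

```python
from typing import Optional, Sequence


def level_compaction_reason(
    sstable_sizes: Sequence[int],
    max_level_bytes: int,     # hypothetical size limit for the level
    max_sstable_count: int,   # hypothetical limit on the number of sstables
) -> Optional[str]:
    """Return why a level would need compaction, or None if it would not."""
    # Size-based trigger: the level holds more bytes than it should,
    # so it would go through a normal compaction.
    if sum(sstable_sizes) > max_level_bytes:
        return "oversized"
    # Count-based trigger: the level is within its size limit but is made
    # up of many small files, which a size check alone would not catch.
    if len(sstable_sizes) > max_sstable_count:
        return "too_many_sstables"
    return None
```

A level of fifty 10-byte files with a 1000-byte limit is not oversized, so only the count check would flag it.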

It was originally there in case we had a situation like in L0, where you write a lot of small files, they get compacted together and produce another small file, and the compaction doesn't include other L1 files so that there is either a small number of files or a larger file.

I like the ideas for the improvements; both are definitely worth investigating. I'll discuss a plan for testing this with [~enigmacurry] this week.

> Allow multiple overlapping sstables in L1
> -----------------------------------------
>
>                 Key: CASSANDRA-7409
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7409
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Carl Yeksigian
>            Assignee: Carl Yeksigian
>              Labels: compaction
>             Fix For: 3.0
>
>
> Currently, when a normal L0 compaction takes place (not STCS), we take up to MAX_COMPACTING_L0 L0 sstables and all of the overlapping L1 sstables and compact them together. If we didn't have to deal with the overlapping L1 tables, we could compact a higher number of L0 sstables together into a set of non-overlapping L1 sstables.
> This could be done by delaying the invariant that L1 has no overlapping sstables. Going from L1 to L2, we would be compacting fewer sstables together which overlap.
> When reading, we will not have the same one sstable per level (except L0) guarantee, but this can be bounded (once we have too many sets of sstables, either compact them back into the same level, or compact them up to the next level).
> This could be generalized to allow any level to be the maximum for this overlapping strategy.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
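
The current L0 compaction step described in the issue can be sketched roughly as follows. This is a simplified illustration, not Cassandra's actual selection logic: the `SSTable` class, the integer token ranges, and the function names are assumptions made for the example, and `max_compacting_l0` stands in for MAX_COMPACTING_L0.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class SSTable:
    """Hypothetical stand-in for an sstable: a name plus its token range."""
    name: str
    first: int  # smallest token covered
    last: int   # largest token covered (inclusive)


def overlaps(a: SSTable, b: SSTable) -> bool:
    """Two inclusive ranges overlap when each starts before the other ends."""
    return a.first <= b.last and b.first <= a.last


def l0_compaction_candidates(
    l0: Sequence[SSTable],
    l1: Sequence[SSTable],
    max_compacting_l0: int,
) -> Tuple[List[SSTable], List[SSTable]]:
    """Pick up to max_compacting_l0 L0 sstables plus every overlapping L1 sstable."""
    # Simplification: take the first N L0 sstables in the order given.
    chosen_l0 = list(l0[:max_compacting_l0])
    # Every L1 sstable that overlaps any chosen L0 sstable must be included,
    # because L1 is required to stay non-overlapping after the compaction.
    overlapping_l1 = [
        s for s in l1 if any(overlaps(s, c) for c in chosen_l0)
    ]
    return chosen_l0, overlapping_l1
```

The more L0 sstables are chosen, the more of L1 tends to be pulled into the same compaction. That is the cost the proposal aims to avoid by letting L1 temporarily hold overlapping sstables.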