Date: Thu, 25 Sep 2014 14:55:35 +0000 (UTC)
From: "Marcus Eriksson (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-7019) Major tombstone compaction

    [ https://issues.apache.org/jira/browse/CASSANDRA-7019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147823#comment-14147823 ]

Marcus Eriksson commented on CASSANDRA-7019:
--------------------------------------------

branch here: https://github.com/krummas/cassandra/commits/marcuse/7019-2

triggered with nodetool compact -o

It writes fully compacted partitions - each partition will only be in a single sstable. My first idea was to put the cells back in the corresponding files where they were found (minus tombstones), but it felt wrong not to actually write out the compacted partition when we already have it.

LCS:
* creates an 'optimal' leveling - it takes all existing files, compacts them, and starts filling each level from L0 up
** note that (if we have token range 0 -> 1000) L1 will get tokens 0 -> 10, L2 11 -> 100 and L3 101 -> 1000. Haven't thought much about whether this is good or bad for future compactions.

STCS:
* calculates an 'optimal' distribution of sstables - currently it makes them 50%, 25%, 12.5%, ... of the total data size until the smallest sstable would be below 50MB, then puts all the rest in the last sstable. If anyone has a more optimal sstable distribution, please let me know. (A sketch of this distribution follows the quoted issue below.)
** the sstables will be non-overlapping; it starts writing the biggest sstable first and continues with the rest once 50% of the data is in it


> Major tombstone compaction
> --------------------------
>
>                 Key: CASSANDRA-7019
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7019
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>              Labels: compaction
>
> It should be possible to do a "major" tombstone compaction by including all sstables, but writing them out 1:1, meaning that if you have 10 sstables before, you will have 10 sstables after the compaction with the same data, minus all the expired tombstones.
> We could do this in two ways:
> # a nodetool command that includes _all_ sstables
> # once we detect that an sstable has more than x% (20%?) expired tombstones, we start one of these compactions, and include all overlapping sstables that contain older data.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
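
As an illustration of the STCS size distribution described in the comment above, here is a minimal Java sketch. It is not code from the linked branch; the class name, method name, and constant are made up for illustration, and it only shows the "halve the remainder until the next sstable would be below 50MB, then put the rest in the last sstable" idea.

    import java.util.ArrayList;
    import java.util.List;

    public class MajorCompactionSplitSketch {
        // Hypothetical constant: the cutoff mentioned in the comment above.
        static final long MIN_SSTABLE_BYTES = 50L * 1024 * 1024;

        // Hypothetical helper: compute target sstable sizes as 50%, 25%, 12.5%, ...
        // of the total data size, stopping once the next share would drop below
        // 50MB, then put the remainder into one final sstable.
        static List<Long> targetSizes(long totalBytes) {
            List<Long> sizes = new ArrayList<>();
            long remaining = totalBytes;
            while (remaining / 2 >= MIN_SSTABLE_BYTES) {
                long share = remaining / 2;   // 50%, 25%, 12.5%, ... of the total
                sizes.add(share);
                remaining -= share;
            }
            sizes.add(remaining);             // everything left goes in the last sstable
            return sizes;
        }

        public static void main(String[] args) {
            // e.g. 1GB of data -> roughly 512MB, 256MB, 128MB, 64MB, 64MB
            System.out.println(targetSizes(1024L * 1024 * 1024));
        }
    }

Under this reading, the biggest target size is written first (non-overlapping token ranges), and the remaining, successively smaller sstables follow once 50% of the data has gone into the first one.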