Date: Wed, 24 Sep 2014 18:28:35 +0000 (UTC)
From: "Marcus Eriksson (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-7019) Major tombstone compaction

    [ https://issues.apache.org/jira/browse/CASSANDRA-7019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14146685#comment-14146685 ]

Marcus Eriksson commented on CASSANDRA-7019:
--------------------------------------------

[~kohlisankalp] I'll post a proof-of-concept patch for option 1 in the description tomorrow. The idea is basically to run a major compaction, but have the compaction strategy decide on an 'optimal' sstable distribution for the strategy instead of just creating one big sstable. For LCS it simply fills levels from level 1 and up.
For STCS it will create sstables where one has 50% of the data, one 25%, etc., until the sstables get too small.

This is mostly for the "oh crap we have a ton of tombstones and need to get rid of them" case, not for the day-to-day case; we still need to figure out something more for that (like your idea perhaps).

> Major tombstone compaction
> --------------------------
>
>                 Key: CASSANDRA-7019
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7019
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>              Labels: compaction
>
> It should be possible to do a "major" tombstone compaction by including all sstables, but writing them out 1:1, meaning that if you have 10 sstables before, you will have 10 sstables after the compaction with the same data, minus all the expired tombstones.
> We could do this in two ways:
> # a nodetool command that includes _all_ sstables
> # once we detect that an sstable has more than x% (20%?) expired tombstones, we start one of these compactions, and include all overlapping sstables that contain older data.
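
The two target distributions mentioned in the comment (STCS: one sstable with 50% of the data, one with 25%, and so on; LCS: fill levels from level 1 and up) can be illustrated with a rough sketch. This is not the patch and not Cassandra code. The function names, the `min_sstable_bytes` cutoff, the fixed `sstable_bytes` size, and the assumption that level n holds `fanout**n` sstables are all hypothetical choices made for illustration.

```python
from typing import List


def stcs_target_sizes(total_bytes: int, min_sstable_bytes: int) -> List[int]:
    """Split total_bytes into sstables holding 50%, 25%, 12.5%, ... of the data.

    Halving stops once the next sstable would fall below min_sstable_bytes
    (a hypothetical cutoff); whatever is left is written as one last sstable.
    """
    sizes: List[int] = []
    remaining = total_bytes
    while remaining > 0:
        half = remaining // 2
        if half < min_sstable_bytes:
            # Too small to split further: emit the rest as a single sstable.
            sizes.append(remaining)
            break
        sizes.append(half)
        remaining -= half
    return sizes


def lcs_level_fill(total_bytes: int, sstable_bytes: int, fanout: int = 10) -> List[int]:
    """Return how many sstables land in each level, starting at level 1.

    Assumes fixed-size sstables and that level n holds at most fanout**n
    of them (an assumption of this sketch, not taken from the comment).
    """
    if sstable_bytes <= 0:
        raise ValueError("sstable_bytes must be positive")
    count = -(-total_bytes // sstable_bytes)  # ceiling division
    levels: List[int] = []
    level = 1
    while count > 0:
        used = min(count, fanout ** level)
        levels.append(used)
        count -= used
        level += 1
    return levels


if __name__ == "__main__":
    print(stcs_target_sizes(1000, 100))
    print(lcs_level_fill(1250, 10))
```

In both cases the total amount of data is preserved and only the layout changes, which is the point of letting the strategy pick the distribution instead of writing a single large sstable.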