From: Paulo Ricardo Motta Gomes
Date: Tue, 6 May 2014 21:14:59 -0300
Subject: Re: Automatic tombstone removal issue (STCS)
To: user@cassandra.apache.org

Hello,

Sorry for being persistent, but I'd love to clarify my understanding of this. Has anyone seen single-SSTable compaction being triggered for STCS SSTables with a high tombstone ratio?

Because if my understanding in the message quoted below is correct, the current implementation almost never triggers this kind of compaction, since the token ranges of a node's SSTables almost always overlap. Could this be a bug, or is it expected behavior?

Thank you,

On Mon, May 5, 2014 at 8:59 AM, Paulo Ricardo Motta Gomes <paulo.motta@chaordicsystems.com> wrote:

> Hello,
>
> After noticing that automatic tombstone removal (CASSANDRA-3442) was not
> working in an append-only STCS CF with a droppable tombstone ratio of 40%, I
> investigated why compaction was not being triggered on the largest
> SSTable, which is 16GB and has a droppable tombstone ratio of about 70%.
>
> When the code checks whether the SSTable is a candidate to be compacted
> (AbstractCompactionStrategy.worthDroppingTombstones), it checks whether
> the other SSTables overlap with the current SSTable by comparing their
> start and end tokens. The problem is that all SSTables contain
> pretty much the whole node token range, so all of them overlap nearly all
> the time, and the automatic tombstone removal never happens. Is there any
> case in STCS where the SSTables' token ranges DO NOT overlap?
>
> I understand that during the tombstone removal process it's necessary to
> verify whether the compacted row exists in any other SSTable, but I don't
> understand why it's necessary to check whether the token ranges overlap to
> decide if a tombstone compaction must be executed on a single SSTable with
> a high droppable tombstone ratio.
>
> Any clarification would be greatly appreciated.
>
> PS: Cassandra version: 1.2.16
>
> --
> *Paulo Motta*
>
> Chaordic | *Platform*
> *www.chaordic.com.br*
> +55 48 3232.3200

--
*Paulo Motta*

Chaordic | *Platform*
*www.chaordic.com.br*
+55 48 3232.3200
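
To make the check described in the quoted message concrete, here is a minimal sketch of the behavior as I understand it. It is not the Cassandra source: the `SSTable` class, the `long` tokens, the `overlaps` helper and the 0.2 threshold are all simplifying assumptions made up for illustration, and the real `worthDroppingTombstones` may do more than this.

```java
import java.util.Arrays;
import java.util.List;

public class TombstoneCheckSketch {
    // Hypothetical stand-in for an SSTable: only the fields this check needs.
    static final class SSTable {
        final String name;
        final long firstToken;       // smallest token stored in the SSTable
        final long lastToken;        // largest token stored in the SSTable
        final double droppableRatio; // estimated droppable tombstone ratio

        SSTable(String name, long firstToken, long lastToken, double droppableRatio) {
            this.name = name;
            this.firstToken = firstToken;
            this.lastToken = lastToken;
            this.droppableRatio = droppableRatio;
        }
    }

    // Assumed threshold, chosen for illustration only.
    static final double TOMBSTONE_THRESHOLD = 0.2;

    // Two SSTables overlap when their [firstToken, lastToken] intervals intersect.
    static boolean overlaps(SSTable a, SSTable b) {
        return a.firstToken <= b.lastToken && b.firstToken <= a.lastToken;
    }

    // Simplified model: a candidate qualifies only if its ratio is above the
    // threshold and no other SSTable covers an intersecting token interval.
    static boolean worthDroppingTombstones(SSTable candidate, List<SSTable> all) {
        if (candidate.droppableRatio <= TOMBSTONE_THRESHOLD) {
            return false;
        }
        for (SSTable other : all) {
            if (other != candidate && overlaps(candidate, other)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Under STCS every SSTable spans almost the whole token range of the node.
        SSTable big = new SSTable("big", 0, 1000, 0.7);
        SSTable small1 = new SSTable("small1", 5, 990, 0.1);
        SSTable small2 = new SSTable("small2", 12, 997, 0.05);

        // prints false: the 70% SSTable is rejected because the others overlap it
        System.out.println(worthDroppingTombstones(big, Arrays.asList(big, small1, small2)));
        // prints true: with no other SSTable present, nothing overlaps
        System.out.println(worthDroppingTombstones(big, Arrays.asList(big)));
    }
}
```

In this model the 16GB SSTable from the quoted message would be rejected for as long as any other SSTable on the node shares part of its token interval, which matches what I am observing.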