From: Igor
Date: Wed, 18 Apr 2012 07:26:19 +0300
To: user@cassandra.apache.org
Subject: Re: size tiered compaction - improvement

Thank you Jonathan, I missed this point about converting TTL data to tombstones first.

When you say:
> You absolutely can.  That's what the "user defined" part is: you give
> it the exact list of sstables you want compacted.

does that mean I can pass a list of sstables (not just one) as the second parameter to userDefinedCompaction?

On 04/18/2012 05:53 AM, Jonathan Ellis wrote:
> On Sat, Apr 14, 2012 at 4:08 AM, Igor <igor@4friends.od.ua> wrote:
>> Assume I insert all my data with TTL=2weeks, and say we have sstable A which
>> was created a week ago at time T, so I know that right now it contains:
>>
>> 1) some data that was inserted no later than T and may not have expired yet
>> 2) some amount of data that was already close to expiration due to TTL at
>> time T, but has had no chance to be wiped out because, up to the current
>> moment, size-tiered compaction has not included A in any compaction.
>>
>> A large amount of the data from 2) expired in the week after time T and
>> has probably passed the gc_grace period, so it should be wiped by any
>> compaction on sstable A.
> Any compaction pass over A will first convert the TTL data into tombstones.
>
> Then, any subsequent pass that includes A *and all other sstables
> containing rows with the same key* will drop the tombstones.
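
To make the two stages described above concrete, here is a toy model in Python. It is only a sketch of the behaviour as stated in this thread, not Cassandra's actual compaction code: the `Cell` class, the `compact` function and all the numbers are invented for illustration, each sstable is just a list of cells, and version reconciliation between cells with the same key is ignored.

```python
from dataclasses import dataclass, replace
from typing import List, Optional

DAY = 86400  # seconds


@dataclass(frozen=True)
class Cell:
    """Hypothetical, heavily simplified stand-in for a column in an sstable."""
    key: str
    written_at: int                   # write time, in seconds
    ttl: Optional[int] = None         # None means the cell never expires
    deleted_at: Optional[int] = None  # set once the cell has become a tombstone

    @property
    def is_tombstone(self) -> bool:
        return self.deleted_at is not None


def compact(inputs: List[List[Cell]], others: List[List[Cell]],
            now: int, gc_grace: int) -> List[Cell]:
    """Merge `inputs` into one sstable; `others` are sstables left out of this pass."""
    keys_elsewhere = {cell.key for table in others for cell in table}
    out = []
    for table in inputs:
        for cell in table:
            if cell.is_tombstone:
                past_grace = cell.deleted_at + gc_grace <= now
                # Drop only if no sstable outside this pass holds the same key.
                if past_grace and cell.key not in keys_elsewhere:
                    continue
                out.append(cell)
            elif cell.ttl is not None and cell.written_at + cell.ttl <= now:
                # Stage one: expired TTL data is rewritten as a tombstone.
                out.append(replace(cell, deleted_at=cell.written_at + cell.ttl))
            else:
                out.append(cell)
    return out


# Assumed example values: "now" is day 30, TTL is two weeks, gc_grace is 10 days.
now, gc_grace = 30 * DAY, 10 * DAY
a = [Cell("k1", written_at=0, ttl=14 * DAY),         # expired on day 14
     Cell("k2", written_at=25 * DAY, ttl=14 * DAY)]  # still live
b = [Cell("k1", written_at=20 * DAY, ttl=14 * DAY)]  # same key, other sstable

pass1 = compact([a], [b], now, gc_grace)         # k1 becomes a tombstone
pass2 = compact([pass1], [b], now, gc_grace)     # tombstone kept: k1 is also in b
pass3 = compact([pass1, b], [], now, gc_grace)   # tombstone finally dropped
```

In this model, compacting A alone never removes the `k1` tombstone, no matter how often it is repeated; it disappears only in the pass that also includes `b`, the other sstable holding the same key.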

