From: sankalp kohli <kohlisankalp@gmail.com>
Date: Mon, 26 Aug 2013 12:09:38 -0700
Subject: Re: Periodical deletes and compaction strategy
To: user@cassandra.apache.org

The problem is that tombstones will hang around until the GC grace period
has passed. You can reduce the GC grace period and then catch lost deletes
in the application layer, if you know you should not be seeing such an old
record.

Also, 1.2 added a setting that allows an SSTable to be compacted on its own
if it contains a lot of tombstones. You might want to look at that.
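For illustration, both knobs in CQL3. The keyspace/table name ks.events and
the numbers are just placeholders, tune them for your own setup:

    -- Shrink the tombstone GC window from the 10-day default to 1 day.
    -- Only safe if repair runs more often than that, or if the application
    -- can detect deleted rows that come back.
    ALTER TABLE ks.events WITH gc_grace_seconds = 86400;

    -- 1.2 compaction sub-option: allow a single SSTable to be compacted
    -- by itself once ~20% of its data is droppable tombstones.
    ALTER TABLE ks.events
      WITH compaction = {'class': 'SizeTieredCompactionStrategy',
                         'tombstone_threshold': 0.2};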
On Mon, Aug 26, 2013 at 3:31 AM, cem <cayiroglu@gmail.com> wrote:

> Hi Alain,
>
> I solved the same issue by implementing a client that manages time-range
> partitions. Each time-range partition is a CF.
>
> Cem.
>
>
> On Mon, Aug 26, 2013 at 11:34 AM, Alain RODRIGUEZ <arodrime@gmail.com> wrote:
>
>> Hi,
>>
>> Any guidance on this topic would be appreciated :).
>>
>>
>> 2013/8/23 Alain RODRIGUEZ <arodrime@gmail.com>
>>
>>> Hi,
>>>
>>> I am currently using about 10 CFs to store temporal data. Those data
>>> are growing pretty big (hundreds of GB, when I actually only need the
>>> last month's worth of information, i.e. about hundreds of MB).
>>>
>>> I am going to delete the old (and useless) data. I cannot always use
>>> TTL, since I have counters too. Yet I know that deletes are a bit
>>> tricky in Cassandra, because they are distributed.
>>>
>>> I was wondering about the best way to keep performance high and get rid
>>> of tombstones easily.
>>>
>>> I was considering two ways to do it:
>>>
>>> - Major compaction on these 10 CFs, to force them to always keep only
>>> fresh data and to remove tombstones
>>> - LCS, to have a better chance of getting all parts of a row into one
>>> SSTable, allowing tombstones to be removed eventually
>>>
>>> Which would be the better option (i.e. what would be the impact of each
>>> solution)?
>>> Do you need more information about those CFs to answer this question?
>>>
>>> Any insight is welcome, as always.
>>>
>>> Alain
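For reference, Alain's two options boil down to the following, with the same
placeholder names (ks.events) as above:

    # option 1: user-triggered major compaction of one CF
    nodetool compact ks events

    -- option 2: move the CF to leveled compaction (CQL3)
    ALTER TABLE ks.events
      WITH compaction = {'class': 'LeveledCompactionStrategy'};

Keep in mind that a major compaction leaves behind one very large SSTable
that size-tiered compaction will rarely revisit afterwards, so tombstones
written after it can linger again unless you keep triggering it.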
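A minimal sketch of the time-range-partition idea cem describes: one CF per
month, so expired data is removed by dropping a whole CF instead of issuing
row deletes that leave tombstones behind. All names here (the events_YYYYMM
pattern, the helper functions) are made up for illustration:

    from datetime import datetime

    def cf_for(ts):
        """Column family that holds data for the given timestamp."""
        return "events_%04d%02d" % (ts.year, ts.month)

    def cfs_in_range(start, end):
        """All monthly CFs a read covering [start, end] must query."""
        names, y, m = [], start.year, start.month
        while (y, m) <= (end.year, end.month):
            names.append("events_%04d%02d" % (y, m))
            y, m = (y + 1, 1) if m == 12 else (y, m + 1)
        return names

    def expired_cfs(existing, now, keep_months=1):
        """CFs old enough to DROP outright (keeps the current month plus
        keep_months previous months). Lexicographic comparison works
        because the names are zero-padded and fixed-width."""
        y, m = now.year, now.month
        for _ in range(keep_months):
            y, m = (y - 1, 12) if m == 1 else (y, m - 1)
        cutoff = "events_%04d%02d" % (y, m)
        return [cf for cf in existing if cf < cutoff]

    # e.g. writes go to cf_for(datetime.utcnow()); reads over a window fan
    # out across cfs_in_range(start, end); a nightly job drops every CF
    # returned by expired_cfs(...).

Dropping a CF removes its data files wholesale, so no tombstones and no GC
grace period are involved for the expired partitions.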