From: Alain RODRIGUEZ
Date: Fri, 23 Aug 2013 10:01:44 +0200
Subject: Periodical deletes and compaction strategy
To: user@cassandra.apache.org

Hi,

I am currently using about 10 column families (CFs) to store temporal data. This data is growing pretty big: hundreds of GB, when I actually only need information from the last month, i.e. hundreds of MB.

I am going to delete the old (and useless) data. I cannot always use TTLs, since I have counters too. Yet I know that deletes are a bit tricky in Cassandra because they are distributed.

I was wondering about the best way to keep high performance and get rid of tombstones easily.

I was considering two ways to do it:

- Major compaction on these 10 CFs, to force them to keep only fresh data and remove tombstones.
- LCS (LeveledCompactionStrategy), to have a better chance of getting all parts of a row into one SSTable, allowing tombstones to be removed eventually.
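
The reasoning behind both options can be sketched in Python. This is a simplified model under stated assumptions, not Cassandra's actual code: the `SSTable` class and the `tombstone_is_purgeable` function are hypothetical names, and the rule modeled is that a compaction may drop a tombstone only once it is older than `gc_grace_seconds` and no SSTable outside the compaction still holds part of the same row.

```python
from dataclasses import dataclass

# Cassandra's default gc_grace_seconds is 10 days.
GC_GRACE_SECONDS = 10 * 24 * 3600


@dataclass(frozen=True)
class SSTable:
    """Hypothetical stand-in for an SSTable: a name plus the row keys it contains."""
    name: str
    row_keys: frozenset


def tombstone_is_purgeable(row_key, deleted_at, now, compacting, all_sstables,
                           gc_grace=GC_GRACE_SECONDS):
    """Simplified purge rule (an assumption, not the real implementation).

    The tombstone can be dropped only if it is older than gc_grace and no
    SSTable left out of this compaction still contains the row.
    """
    if now - deleted_at < gc_grace:
        return False  # still inside the grace period
    compacting_names = {s.name for s in compacting}
    others = [s for s in all_sstables if s.name not in compacting_names]
    return not any(row_key in s.row_keys for s in others)


day = 24 * 3600
now = 100 * day
a = SSTable("a", frozenset({"row1"}))
b = SSTable("b", frozenset({"row1", "row2"}))
tables = [a, b]

# Compacting only "a": row1 also lives in "b", so the tombstone must be kept.
minor = tombstone_is_purgeable("row1", now - 30 * day, now, [a], tables)
# Major compaction of every SSTable: nothing is left outside, so it can go.
major = tombstone_is_purgeable("row1", now - 30 * day, now, tables, tables)
# Deleted only 2 days ago: inside gc_grace, kept even by a major compaction.
recent = tombstone_is_purgeable("row1", now - 2 * day, now, tables, tables)
print(minor, major, recent)
```

In this model a major compaction always satisfies the second condition because it includes every SSTable, while LCS helps by spreading each row over fewer SSTables, so a regular compaction is more likely to include all of them.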

Which would be the better option, i.e. what would be the impact of each solution?
Do you need more information about those CFs to answer this question?

Any insight is welcome, as always.

Alain