From: Redmumba <redmumba@gmail.com>
Date: Wed, 4 Jun 2014 10:03:26 -0700
Subject: Re: Customized Compaction Strategy: Dev Questions
To: Russell Bradberry
Cc: user@cassandra.apache.org

Not quite; if I'm at, say, 90% disk usage, I'd like to drop the oldest
sstable rather than simply run out of space. The problem with using TTLs is
that I have to guess how much data is coming in--since this is auditing
data, the volume can vary wildly with time of year, verbosity of auditing,
etc. I'd like to maximize the disk space used--not optimize the cleanup
process.

Andrew

On Wed, Jun 4, 2014 at 9:47 AM, Russell Bradberry wrote:
> You mean this:
>
> https://issues.apache.org/jira/browse/CASSANDRA-5228
>
> ?
>
>
> On June 4, 2014 at 12:42:33 PM, Redmumba (redmumba@gmail.com) wrote:
>
> Good morning!
>
> I've asked (and seen other people ask) about the ability to drop old
> sstables, basically creating a FIFO-like clean-up process. Since we're
> using Cassandra as an auditing system, this is particularly appealing to
> us because it means we can maximize the amount of auditing data we can
> keep while still allowing Cassandra to clear old data automatically.
>
> My idea is this: perform compaction based on the range of dates available
> in the sstable (or just metadata about when it was created). For example,
> a major compaction could create one combined sstable per day--so that,
> say, 60 days of data after a major compaction would yield 60 sstables.
>
> My question then is: will this be possible by simply implementing a
> separate AbstractCompactionStrategy? Does this sound feasible at all?
> Based on the implementations of the Size and Leveled strategies, it looks
> like I would have the ability to control what and how things get
> compacted, but I wanted to verify before putting time into it.
>
> Thank you so much for your time!
>
> Andrew
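The drop-oldest-until-under-threshold selection the thread describes can be sketched independently of Cassandra's internals. This is a minimal model only: the class and method names below are hypothetical, and this is not Cassandra's actual AbstractCompactionStrategy API (a real strategy would subclass that and return compaction tasks instead of a list).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical model of the FIFO clean-up proposed in the thread: given the
// sstables on disk and a usage threshold, pick the oldest sstables to delete
// until usage falls back under the threshold.
public class FifoDropSketch {
    static final class SSTable {
        final String name;
        final long createdMillis; // e.g. from sstable metadata
        final long sizeBytes;
        SSTable(String name, long createdMillis, long sizeBytes) {
            this.name = name;
            this.createdMillis = createdMillis;
            this.sizeBytes = sizeBytes;
        }
    }

    /** Oldest-first list of sstables to drop so that used/capacity <= threshold. */
    static List<SSTable> selectExpendable(List<SSTable> tables,
                                          long capacityBytes,
                                          double threshold) {
        long used = tables.stream().mapToLong(t -> t.sizeBytes).sum();
        List<SSTable> byAge = new ArrayList<>(tables);
        byAge.sort(Comparator.comparingLong(t -> t.createdMillis)); // oldest first
        List<SSTable> toDrop = new ArrayList<>();
        for (SSTable t : byAge) {
            if ((double) used / capacityBytes <= threshold) break;
            toDrop.add(t);
            used -= t.sizeBytes; // pretend it was deleted
        }
        return toDrop;
    }

    public static void main(String[] args) {
        // A 100 GB disk at 92% usage with a 90% threshold: dropping the
        // single oldest (2 GB) sstable is enough to get back under.
        long gb = 1L << 30;
        List<SSTable> tables = Arrays.asList(
            new SSTable("day-01", 1, 2 * gb),
            new SSTable("day-02", 2, 40 * gb),
            new SSTable("day-03", 3, 50 * gb));
        List<SSTable> drop = selectExpendable(tables, 100 * gb, 0.90);
        System.out.println(drop.size() + " " + drop.get(0).name);
    }
}
```

Note this deletes whole sstables by age rather than compacting them, which only matches the "one sstable per day" layout if a major compaction has already grouped rows by date; otherwise a single old sstable may still hold recent rows.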