From: aaron morton
To: user@cassandra.apache.org
Subject: Re: Deleting old items during compaction (WAS: Deleting old items)
Date: Mon, 18 Feb 2013 06:16:19 +1300

That's what the TTL does.

Manually delete all the older data now, then start using TTL.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 13/02/2013, at 11:08 PM, Ilya Grebnov wrote:

> Hi,
>
> We are looking for a solution to the same problem. We have a wide column family with counters and we want to delete old data, for example data that is 1 month old. One potential idea was to implement a hook in the compaction code and drop the columns we don't need. Is this a viable option?
>
> Thanks,
> Ilya
>
> From: aaron morton [mailto:aaron@thelastpickle.com]
> Sent: Tuesday, February 12, 2013 9:01 AM
> To: user@cassandra.apache.org
> Subject: Re: Deleting old items
>
> > So is it possible to delete all the data inserted in some CF between 2 dates or data older than 1 month ?
> No.
>
> You need to issue row level deletes. If you don't know the row key you'll need to do range scans to locate them.
>
> If you are deleting parts of wide rows consider reducing the min_compaction_threshold on the CF to 2
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 12/02/2013, at 4:21 AM, Alain RODRIGUEZ wrote:
>
> > Hi,
> >
> > I would like to know if there is a way to delete old/unused data easily ?
> >
> > I know about TTL but there are 2 limitations of TTL:
> >
> > - AFAIK, there is no TTL on counter columns
> > - TTL needs to be defined at write time, so it's too late for data already inserted.
> >
> > I could also use a standard "delete" but it seems inappropriate for such a massive deletion.
> >
> > In some cases, I don't know the row key and would like to delete all the rows starting with, let's say, "1050#..."
> >
> > Even better, I understood that columns are always inserted in C* with (name, value, timestamp). So is it possible to delete all the data inserted in some CF between 2 dates or data older than 1 month ?
> >
> > Alain

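
A minimal sketch of the two steps suggested at the top of this message (delete the old rows once, then write with a TTL), offered as an illustration rather than as anything posted to the list. The table name `metrics.events`, the column names `row_key`, `name` and `val`, and every function name are hypothetical. The in-memory dictionary only stands in for a range scan, and timestamps are assumed to be microseconds since the epoch. As noted in the thread, TTL is reportedly not available on counter columns, so the TTL half applies to regular columns only.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
RETENTION = timedelta(days=30)  # "1 month", approximated here as 30 days

# Hypothetical table name, for illustration only.
TABLE = "metrics.events"


def to_micros(moment):
    """Convert an aware datetime to microseconds since the epoch."""
    return (moment - EPOCH) // timedelta(microseconds=1)


def ttl_seconds(retention=RETENTION):
    """TTL values are given in whole seconds."""
    return int(retention.total_seconds())


def expired_row_keys(rows, now, retention=RETENTION, key_prefix=""):
    """Stand-in for a range scan over row keys.

    `rows` maps a row key to a list of (name, value, timestamp) tuples.
    A row is reported only when its newest column is older than the
    cutoff, so a row-level delete would not drop recent data.
    """
    cutoff = to_micros(now - retention)
    expired = []
    for key, columns in rows.items():
        if not key.startswith(key_prefix) or not columns:
            continue
        newest = max(timestamp for _name, _value, timestamp in columns)
        if newest < cutoff:
            expired.append(key)
    return sorted(expired)


def delete_row_cql():
    """Row-level delete for one key (assumed schema)."""
    return f"DELETE FROM {TABLE} WHERE row_key = ?"


def insert_with_ttl_cql(retention=RETENTION):
    """Insert that expires after the retention period (assumed schema)."""
    return (
        f"INSERT INTO {TABLE} (row_key, name, val) "
        f"VALUES (?, ?, ?) USING TTL {ttl_seconds(retention)}"
    )


if __name__ == "__main__":
    now = datetime(2013, 2, 17, tzinfo=timezone.utc)
    sample = {
        "1050#a": [("c1", 1, to_micros(datetime(2013, 1, 1, tzinfo=timezone.utc)))],
        "1050#b": [("c1", 1, to_micros(datetime(2013, 2, 10, tzinfo=timezone.utc)))],
    }
    for key in expired_row_keys(sample, now, key_prefix="1050#"):
        print(delete_row_cql(), "-- bind:", key)
    print(insert_with_ttl_cql())
```

Rows that mix old and recent columns are deliberately skipped here; trimming part of a wide row would need column-level deletes instead, which is the case the compaction threshold advice in the quoted reply is about.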