From: Chris Lohfink
To: user@cassandra.apache.org
Subject: Re: do Cassandra generate a event or log containing key value of column when a column expires due to TTL
Date: Fri, 22 Aug 2014 09:24:22 -0500

A few options I can think of; there are probably some better ideas out there. These mostly depend on the size of the data and how frequently it is updated.

1) A MapReduce or Spark job to filter out non-empty rows.
2) Add some logging and do a custom build of Cassandra (i.e. in "removeDeletedCF" of ColumnFamilyStore), then grep the log files to get a report that can be consumed by your system. This potentially produces a lot of redundant data, and an entry may show up long after the actual expiration.
3) When you insert your TTLed columns, also write to a wide row with the timestamp of expiration, and use that to drive the report. There will be a bit to that, though, and an MR job still might be a good idea for doing the processing.
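
As a rough illustration of option 3, here is a minimal Python sketch. It uses an in-memory dict as a stand-in for the wide row; the names ExpiryIndex, record_write and expired_before, and the per-day bucketing, are hypothetical and not part of Cassandra. The expiration is estimated from the client write timestamp plus the TTL, whereas the real expiration is determined by the server's clock, so treat the computed value as approximate.

```python
from collections import defaultdict

SECONDS_PER_DAY = 86400


class ExpiryIndex:
    """In-memory stand-in for a wide row keyed by expiration day (hypothetical)."""

    def __init__(self):
        # day bucket -> list of (expires_at_seconds, row_key, column_name)
        self._buckets = defaultdict(list)

    def record_write(self, row_key, column_name, write_ts_micros, ttl_seconds):
        """Call alongside every TTLed insert; returns the estimated expiration time."""
        expires_at = write_ts_micros // 1_000_000 + ttl_seconds
        self._buckets[expires_at // SECONDS_PER_DAY].append(
            (expires_at, row_key, column_name)
        )
        return expires_at

    def expired_before(self, now_seconds):
        """Return entries whose TTL has elapsed by now_seconds, oldest first."""
        last_bucket = now_seconds // SECONDS_PER_DAY
        hits = []
        for bucket, entries in self._buckets.items():
            if bucket <= last_bucket:
                hits.extend(e for e in entries if e[0] <= now_seconds)
        return sorted(hits)


index = ExpiryIndex()
# Values taken from the question: microsecond write timestamp, TTL in seconds.
expiry = index.record_write("119551747098", "c:per:@batchId", 1408345109805011, 1436489)
index.record_write("119551747098", "c:per:@currency", 1408345109805009, 1436489)

# Drive the report: everything that should have expired by the given time.
report = index.expired_before(expiry)
for expires_at, row_key, column_name in report:
    print(row_key, column_name, expires_at)
```

In a real deployment the index entries would themselves be written to Cassandra, and the column value could be stored next to the column name if the audit trail needs it.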

---
Chris

On Aug 22, 2014, at 6:02 AM, Gaurav Bhatnagar <gbhatnagar@gmail.com> wrote:

> Hi,
>
> I have stored the following data structure in Cassandra:
>
> RowKey: 119551747098
> => (name=c:per:@batchId, value=ad1, timestamp=1408345109805011, ttl=1436489)
> => (name=c:per:@currency, value=USD, timestamp=1408345109805009, ttl=1436489)
> => (name=c:per:@decimalValue, value=2, timestamp=1408345109805003, ttl=1436489)
>
> Here the RowKey 119551747098 is a number containing the serial number of the data.
>
> These columns expire when the TTL value for that column is reached.
>
> I want to generate an audit trail which contains the value of the RowKey, along with the column name and value, when they get deleted due to TTL expiration.
>
> I want this audit trail for reconciliation purposes, so that I can know which RowKeys have been deleted from the system.
>
> Is there any way in Cassandra through which I can print the value of RowKeys which get deleted due to TTL expiration?
>
> Regards,
> Gaurav

