From: aaron morton
To: user@cassandra.apache.org
Subject: Re: high write load, with lots of updates, considerations? tomestombed data coming back to life
Date: Thu, 25 Jul 2013 17:14:41 +1200

> I was watching some videos from the C* summit 2013 and I recall many people saying that if you can come up with a design where you don't perform updates on rows, that would make things easier (I believe it was because there would be less compaction).

Not entirely true.

There will always be compaction. But if you do updates there are overwrites, which means there is data on disk that is irrelevant and is not released until compaction gets to those files.

> Could old tombstoned data somehow come back to life? I forget what scenario brings about old data (kinda scary!).

If you don't run repair on every node at least once every gc_grace_seconds, there is a chance of it happening.

Cheers

-----------------
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 24/07/2013, at 4:22 AM, S Ahmed wrote:

> I was watching some videos from the C* summit 2013 and I recall many people saying that if you can come up with a design where you don't perform updates on rows, that would make things easier (I believe it was because there would be less compaction).
>
> When building an analytics (time series) app on top of C*, based on Twitter's Rainbird design (http://www.slideshare.net/kevinweil/rainbird-realtime-analytics-at-twitter-strata-2011), this means there will be lots and lots of counters.
>
> With lots of counters (updates), admin wise, what are some things to consider?
>
> Could old tombstoned data somehow come back to life? I forget what scenario brings about old data (kinda scary!).
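
To make the repair / gc_grace_seconds point above a bit more concrete, here is a rough toy model in Python. It is only a sketch under simplifying assumptions, not how Cassandra is implemented: two in-memory "replicas", last-write-wins by timestamp, a delete that reaches only one replica, and a `compact` step that drops tombstones older than a made-up `GC_GRACE_SECONDS`. All names (`Cell`, `Replica`, `repair`, `scenario`) are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Toy value for illustration only; not Cassandra's default.
GC_GRACE_SECONDS = 10


@dataclass(frozen=True)
class Cell:
    value: Optional[str]  # None marks a tombstone (a recorded delete)
    timestamp: int

    @property
    def is_tombstone(self) -> bool:
        return self.value is None


class Replica:
    """Hypothetical in-memory stand-in for one node's data."""

    def __init__(self) -> None:
        self.cells: Dict[str, Cell] = {}

    def apply(self, key: str, cell: Cell) -> None:
        # Last write wins: keep whichever cell has the newer timestamp.
        current = self.cells.get(key)
        if current is None or cell.timestamp > current.timestamp:
            self.cells[key] = cell

    def compact(self, now: int) -> None:
        # Drop tombstones that are older than gc_grace.
        self.cells = {
            k: c for k, c in self.cells.items()
            if not (c.is_tombstone and now - c.timestamp > GC_GRACE_SECONDS)
        }

    def read(self, key: str) -> Optional[str]:
        cell = self.cells.get(key)
        return None if cell is None or cell.is_tombstone else cell.value


def repair(a: Replica, b: Replica) -> None:
    # Simplified repair: exchange cells in both directions, newest wins.
    for key in set(a.cells) | set(b.cells):
        for src, dst in ((a, b), (b, a)):
            cell = src.cells.get(key)
            if cell is not None:
                dst.apply(key, cell)


def scenario(repair_at: int) -> Optional[str]:
    a, b = Replica(), Replica()
    for replica in (a, b):
        replica.apply("k", Cell("v1", timestamp=1))
    # The delete at t=2 reaches only replica a (b missed it).
    a.apply("k", Cell(None, timestamp=2))

    compact_at = 2 + GC_GRACE_SECONDS + 1  # first moment the tombstone can be dropped
    if repair_at < compact_at:
        repair(a, b)  # tombstone is copied to b in time
        a.compact(compact_at)
        b.compact(compact_at)
    else:
        a.compact(compact_at)  # tombstone is gone from a, b never saw it
        b.compact(compact_at)
        repair(a, b)  # b's old value is copied back to a
    return a.read("k")


print(scenario(repair_at=5))   # repair inside gc_grace
print(scenario(repair_at=20))  # repair after gc_grace
```

In this model, repairing inside the grace window lets the tombstone reach the replica that missed the delete, so the key stays deleted everywhere. Repairing only after the tombstone has been dropped leaves nothing to shadow the old value, so it gets copied back.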