From: Kevin Burton
Date: Fri, 16 May 2014 15:31:47 -0700
Subject: Re: Storing log structured data in Cassandra without compactions for performance boost.
To: user@cassandra.apache.org

> If the data is read from a slice of a partition that has been added over
> time there will be a part of that row in almost every sstable. That would
> mean all of them (multiple disk seeks depending on clustering order per
> sstable) would have to be read from in order to service the query. Data
> model can help or hurt a lot though.

Yes… totally agree, but we wouldn't do that. The entire 'row' is
immutable and passes through the system and then expires due to TTL.

TTL is probably the way to go here, especially if Cassandra just drops the
whole SSTable on TTL expiration, which is what I think I'm hearing.

> If you set the TTL for the columns you added then C* will clean up
> sstables (if size tiered and post 1.2) once the data has expired. Since
> you never delete, set gc_grace_seconds to 0 so the TTL expiration doesn't
> result in tombstones.

Thanks!

Kevin

-- 
Founder/CEO Spinn3r.com
Location: San Francisco, CA
Skype: burtonator
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
War is peace. Freedom is slavery. Ignorance is strength. Corporations are
people.
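For anyone reading this thread later, the TTL plus gc_grace_seconds advice quoted in this message could look roughly like the sketch below. It only builds CQL strings; the keyspace "logs", the table "events", its columns, and the 7-day retention are all hypothetical and not taken from the thread. The table options and the USING TTL clause are standard CQL, but treat this as an illustration under those assumptions rather than a tested schema.

```python
# Minimal sketch: build the CQL for an immutable, TTL-expired log table.
# Assumptions (not from the thread): keyspace "logs", table "events",
# the column layout, and a 7-day retention window.
TTL_SECONDS = 7 * 24 * 60 * 60  # assumed retention, in seconds


def create_table_cql(keyspace: str, table: str) -> str:
    """Return a CREATE TABLE statement with gc_grace_seconds = 0
    and size-tiered compaction, as suggested in the quoted reply."""
    return (
        f"CREATE TABLE {keyspace}.{table} (\n"
        "    source text,\n"
        "    ts timeuuid,\n"
        "    payload blob,\n"
        "    PRIMARY KEY (source, ts)\n"
        ") WITH gc_grace_seconds = 0\n"
        "  AND compaction = {'class': 'SizeTieredCompactionStrategy'};"
    )


def insert_cql(keyspace: str, table: str, ttl_seconds: int) -> str:
    """Return a parameterised INSERT that writes each row once with a TTL."""
    if ttl_seconds <= 0:
        raise ValueError("ttl_seconds must be positive")
    return (
        f"INSERT INTO {keyspace}.{table} (source, ts, payload) "
        f"VALUES (?, now(), ?) USING TTL {ttl_seconds};"
    )


if __name__ == "__main__":
    print(create_table_cql("logs", "events"))
    print(insert_cql("logs", "events", TTL_SECONDS))
```

The strings would be handed to cqlsh or whichever driver is in use; nothing here talks to a cluster. Because rows are written once and never deleted or updated, every cell in an sstable carries a TTL, which is the situation the quoted reply describes.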
