From: Aaron Morton
To: Cassandra User
Subject: Re: Storing log structured data in Cassandra without compactions for performance boost.
Date: Thu, 15 May 2014 10:30:27 +1200

If you disable compaction you will end up with a *lot* of SSTables; this will hurt read performance and be a pain to manage (including making repairs and bootstrapping take longer).

STCS is not too onerous; I’d recommend leaving it on. If you want it to run less frequently, increase min_threshold.

Cheers
Aaron

-----------------
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 8/05/2014, at 8:36 am, DuyHai Doan <doanduyhai@gmail.com> wrote:

> Hello Kevin
>
> You can disable compaction by configuring the compaction options of your table as follows:
>
> compaction={'min_threshold': '0', 'class': 'SizeTieredCompactionStrategy', 'max_threshold': '0'}
>
> Regards
>
> Duy Hai DOAN
>
>
> On Wed, May 7, 2014 at 2:55 AM, Kevin Burton <burton@spinn3r.com> wrote:
> I'm looking at storing log data in Cassandra…
>
> Every record is a unique timestamp for the key, and then the log line for the value.
>
> I think it would be best to just disable compactions.
>
> - there will never be any deletes.
>
> - all the data will be accessed in time range (probably partitioned randomly) and sequentially.
>
> So every time a memtable flushes, we will just keep that SSTable forever.
>
> Compacting the data is kind of redundant in this situation.
>
> I was thinking the best strategy is to use setcompactionthreshold and set the value VERY high so compactions are never triggered.
>
> Also, it would be IDEAL to be able to tell Cassandra to just drop a full SSTable so that I can truncate older data without having to do a major compaction and without having to mark everything with a tombstone. Is this possible?
>
>
>
> --
>
> Founder/CEO Spinn3r.com
> Location: San Francisco, CA
> Skype: burtonator
> blog: http://burtonator.wordpress.com
> … or check out my Google+ profile
>
> War is peace. Freedom is slavery. Ignorance is strength. Corporations are people.
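
For reference, the suggestion to keep STCS and raise min_threshold might look roughly like the following in CQL. This is only a sketch: the keyspace and table names (logs.events) are hypothetical and do not come from the thread, and the value 8 is an illustrative choice rather than a recommendation. The STCS defaults are min_threshold 4 and max_threshold 32.

```cql
-- Hypothetical keyspace/table (logs.events); substitute your own.
-- Keeps SizeTieredCompactionStrategy enabled but waits for more
-- similarly sized SSTables before a minor compaction is triggered.
-- Default min_threshold is 4; 8 is an illustrative value only.
ALTER TABLE logs.events
  WITH compaction = {
    'class': 'SizeTieredCompactionStrategy',
    'min_threshold': '8',
    'max_threshold': '32'
  };
```

A higher min_threshold means compaction runs less often, at the cost of more SSTables being consulted per read in the meantime. The same thresholds can also be adjusted at runtime on an individual node with the nodetool setcompactionthreshold command mentioned in the original question.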