From: Sylvain Lebresne
To: "user@cassandra.apache.org"
Date: Sun, 11 Nov 2012 15:13:27 +0100
Subject: Re: leveled compaction and tombstoned data

On Sat, Nov 10, 2012 at 7:17 PM, Edward Capriolo <edlinuxguru@gmail.com> wrote:
> No it does not exist. Rob and I might start a donation page and give
> the money to whoever is willing to code it. If someone would write a
> tool that would split an sstable into 4 smaller sstables (even an
> offline command line tool)

Something like that:
https://github.com/pcmanus/cassandra/commits/sstable_split (adds an sstablesplit offline tool)

> I would paypal them a hundo.

Just tell me how you want to proceed :)

--
Sylvain

On Sat, Nov 10, 2012 at 1:10 PM, Aaron Turner <synfinatic@gmail.com> wrote:
> Nope. I think at least once a week I hear someone suggest one way to solve
> their problem is to "write an sstablesplit tool".
>
> I'm pretty sure that:
>
> Step 1. Write sstablesplit
> Step 2. ???
> Step 3. Profit!
>
>
>
> On Sat, Nov 10, 2012 at 9:40 AM, Alain RODRIGUEZ <arodrime@gmail.com> wrote:
>>
>> @Rob Coli
>>
>> Does the "sstablesplit" function exist somewhere?
>>
>>
>>
>> 2012/11/10 Jim Cistaro <jcistaro@netflix.com>
>>>
>>> For some of our clusters, we have taken the periodic major compaction
>>> route.
>>>
>>> There are a few things to consider:
>>> 1) Once you start major compacting, depending on data size, you may be
>>> committed to doing it periodically because you create one big file that
>>> will take forever to naturally compact against 3 like-sized files.
>>> 2) If you rely heavily on file cache (rather than large row caches), each
>>> major compaction effectively invalidates the entire file cache because
>>> everything is written to one new large file.
>>>
>>> --
>>> Jim Cistaro
>>>
>>> On 11/9/12 11:27 AM, "Rob Coli" <rcoli@palominodb.com> wrote:
>>>
>>> >On Thu, Nov 8, 2012 at 10:12 AM, B. Todd Burruss <btoddb@gmail.com>
>>> > wrote:
>>> >> my question is would leveled compaction help to get rid of the
>>> >> tombstoned data faster than size tiered, and therefore reduce the
>>> >> disk space usage?
>>> >
>>> >You could also...
>>> >
>>> >1) run a major compaction
>>> >2) code up sstablesplit
>>> >3) profit!
>>> >
>>> >This method incurs a management penalty if not automated, but is
>>> >otherwise the most efficient way to deal with tombstones and obsolete
>>> >data.. :D
>>> >
>>> >=Rob
>>> >
>>> >--
>>> >=Robert Coli
>>> >AIM&GTALK - rcoli@palominodb.com
>>> >YAHOO - rcoli.palominob
>>> >SKYPE - rcoli_palominodb
>>> >
>>>
>>
>
>
>
> --
> Aaron Turner
> http://synfin.net/         Twitter: @synfinatic
> http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix &
> Windows
> Those who would give up essential Liberty, to purchase a little temporary
> Safety, deserve neither Liberty nor Safety.
>     -- Benjamin Franklin
> "carpe diem quam minimum credula postero"
>
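
For readers unfamiliar with size-tiered compaction, here is a minimal sketch of point 1 in Jim's message quoted above. It is not Cassandra's code and not anything posted in this thread. It assumes an SSTable joins a bucket when its size is within 0.5x to 1.5x of the bucket's average, and that a bucket is only compacted once it holds 4 files (the big file plus Jim's "3 like sized files"). The function names, thresholds, and sizes are all hypothetical.

```python
from typing import List


def bucket_by_size(sizes_mb: List[float],
                   low: float = 0.5,
                   high: float = 1.5) -> List[List[float]]:
    """Group SSTable sizes into buckets of similar size (assumed rule).

    A file joins the first bucket whose current average size it is
    within [low, high] times of; otherwise it starts a new bucket.
    """
    buckets: List[List[float]] = []
    for size in sorted(sizes_mb):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if low * avg <= size <= high * avg:
                bucket.append(size)
                break
        else:
            # No existing bucket is close enough in size.
            buckets.append([size])
    return buckets


def compactable(buckets: List[List[float]],
                min_threshold: int = 4) -> List[List[float]]:
    """Return only the buckets that hold enough files to be compacted."""
    return [b for b in buckets if len(b) >= min_threshold]


# Hypothetical state after a major compaction: one 400 GB file plus
# four freshly flushed files of roughly 60 MB each.
sizes = [400_000, 60, 55, 62, 58]
buckets = bucket_by_size(sizes)
ready = compactable(buckets)
```

Under these assumptions the four small files form a bucket and get compacted together, while the 400 GB file sits alone in its own bucket. It will not be picked up again until three more files of comparable size exist, which is why tombstones inside it can linger.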
