incubator-cassandra-user mailing list archives

From Sylvain Lebresne <sylv...@datastax.com>
Subject Re: leveled compaction and tombstoned data
Date Sun, 11 Nov 2012 14:13:27 GMT
On Sat, Nov 10, 2012 at 7:17 PM, Edward Capriolo <edlinuxguru@gmail.com> wrote:

> No it does not exist. Rob and I might start a donation page and give
> the money to whoever is willing to code it. If someone would write a
> tool that would split an sstable into 4 smaller sstables (even an
> offline command line tool)


Something like this:
https://github.com/pcmanus/cassandra/commits/sstable_split (adds an
sstablesplit offline tool)


> I would paypal them a hundo.
>

Just tell me how you want to proceed :)

--
Sylvain


>
> On Sat, Nov 10, 2012 at 1:10 PM, Aaron Turner <synfinatic@gmail.com>
> wrote:
> > Nope.  I think at least once a week I hear someone suggest one way to solve
> > their problem is to "write an sstablesplit tool".
> >
> > I'm pretty sure that:
> >
> > Step 1. Write sstablesplit
> > Step 2. ???
> > Step 3. Profit!
> >
> >
> >
> > On Sat, Nov 10, 2012 at 9:40 AM, Alain RODRIGUEZ <arodrime@gmail.com> wrote:
> >>
> >> @Rob Coli
> >>
> >> Does the "sstablesplit" function exist somewhere?
> >>
> >>
> >>
> >> 2012/11/10 Jim Cistaro <jcistaro@netflix.com>
> >>>
> >>> For some of our clusters, we have taken the periodic major compaction
> >>> route.
> >>>
> >>> There are a few things to consider:
> >>> 1) Once you start major compacting, depending on data size, you may be
> >>> committed to doing it periodically because you create one big file that
> >>> will take forever to naturally compact against 3 like-sized files.
> >>> 2) If you rely heavily on file cache (rather than large row caches), each
> >>> major compaction effectively invalidates the entire file cache because
> >>> everything is written to one new large file.
> >>>
> >>> --
> >>> Jim Cistaro
> >>>
> >>> On 11/9/12 11:27 AM, "Rob Coli" <rcoli@palominodb.com> wrote:
> >>>
> >>> >On Thu, Nov 8, 2012 at 10:12 AM, B. Todd Burruss <btoddb@gmail.com>
> >>> > wrote:
> >>> >> my question is would leveled compaction help to get rid of the tombstoned
> >>> >> data faster than size tiered, and therefore reduce the disk space usage?
> >>> >
> >>> >You could also...
> >>> >
> >>> >1) run a major compaction
> >>> >2) code up sstablesplit
> >>> >3) profit!
> >>> >
> >>> >This method incurs a management penalty if not automated, but is
> >>> >otherwise the most efficient way to deal with tombstones and obsolete
> >>> >data.. :D
> >>> >
> >>> >=Rob
> >>> >
> >>> >--
> >>> >=Robert Coli
> >>> >AIM&GTALK - rcoli@palominodb.com
> >>> >YAHOO - rcoli.palominob
> >>> >SKYPE - rcoli_palominodb
> >>> >
> >>>
> >>
> >
> >
> >
> > --
> > Aaron Turner
> > http://synfin.net/         Twitter: @synfinatic
> > http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix &
> > Windows
> > Those who would give up essential Liberty, to purchase a little temporary
> > Safety, deserve neither Liberty nor Safety.
> >     -- Benjamin Franklin
> > "carpe diem quam minimum credula postero"
> >
>
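
To illustrate Jim's first point (the one big file waiting on 3 like-sized files), here is a rough sketch of size-tiered bucketing. It is a simplified model, not Cassandra's actual code: the function names are hypothetical, and the values 0.5, 1.5 and 4 are assumed to match the usual size-tiered defaults (bucket_low, bucket_high, min_threshold). Sizes are in arbitrary units.

```python
def bucket_by_size(sizes, bucket_low=0.5, bucket_high=1.5):
    """Group sstable sizes into buckets of similar size.

    A size joins a bucket when it lies within
    [bucket_low * avg, bucket_high * avg] of that bucket's average.
    The thresholds are assumed defaults, not read from a real cluster.
    """
    buckets = []
    for size in sorted(sizes):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if bucket_low * avg <= size <= bucket_high * avg:
                bucket.append(size)
                break
        else:
            # No existing bucket is close enough in size: start a new one.
            buckets.append([size])
    return buckets


def compactable_buckets(sizes, min_threshold=4):
    """Return only the buckets holding enough sstables to be compacted."""
    return [b for b in bucket_by_size(sizes) if len(b) >= min_threshold]


if __name__ == "__main__":
    # One huge sstable left by a major compaction, plus four fresh flushes.
    print(compactable_buckets([100_000, 100, 100, 100, 100]))
    # After the small ones merge, the big file still has no peers.
    print(compactable_buckets([100_000, 400]))
```

In this model the large file is only picked up again once three more sstables of roughly its size exist, which is why splitting it back into smaller files looks attractive.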
