incubator-cassandra-user mailing list archives

From sankalp kohli <kohlisank...@gmail.com>
Subject Re: Updated sstable size for LCS, ran upgradesstables, file sizes didn't change
Date Sat, 22 Jun 2013 01:14:13 GMT
I think you can remove the json file which stores the mapping of which
sstable is in which level. Cassandra will then treat all sstables as being
in level 0, which will trigger compaction. But if you have a lot of data,
it will be very slow, because you will keep compacting data between L0 and
L1.
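A minimal sketch of that manifest removal, assuming the 1.x-era LCS layout
of <data_dir>/<keyspace>/<cf>.json; verify the path against your own data
directory, and stop the node before touching the file:

```shell
# Sketch only: the manifest path convention is an assumption; check your
# own data directory first, and run this only while the node is stopped.
move_manifest_aside() {
  data_dir=$1; keyspace=$2; cf=$3
  manifest="$data_dir/$keyspace/$cf.json"
  if [ -f "$manifest" ]; then
    # keep a backup rather than deleting the file outright
    mv "$manifest" "$manifest.bak"
    echo "moved $manifest aside; restart the node to re-level from L0"
  fi
}
```

On restart, Cassandra finds no manifest and treats every sstable as level 0.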
This also happens when you write very fast and have a pile-up in L0. A
comment from the code explains what I am saying:
// LevelDB gives each level a score of how much data it contains vs its ideal amount, and
// compacts the level with the highest score. But this falls apart spectacularly once you
// get behind. Consider this set of levels:
// L0: 988 [ideal: 4]
// L1: 117 [ideal: 10]
// L2: 12  [ideal: 100]
//
// The problem is that L0 has a much higher score (almost 250) than L1 (11), so what we'll
// do is compact a batch of MAX_COMPACTING_L0 sstables with all 117 L1 sstables, and put the
// result (say, 120 sstables) in L1. Then we'll compact the next batch of MAX_COMPACTING_L0,
// and so forth. So we spend most of our i/o rewriting the L1 data with each batch.
//
// If we could just do *all* L0 a single time with L1, that would be ideal. But we can't
// -- see the javadoc for MAX_COMPACTING_L0.
//
// LevelDB's way around this is to simply block writes if L0 compaction falls behind.
// We don't have that luxury.
//
// So instead, we
// 1) force compacting higher levels first, which minimizes the i/o needed to compact
//    optimally, which gives us a long term win, and
// 2) if L0 falls behind, we will size-tiered compact it to reduce read overhead until
//    we can catch up on the higher levels.
//
// This isn't a magic wand -- if you are consistently writing too fast for LCS to keep
// up, you're still screwed. But if instead you have intermittent bursts of activity,
// it can help a lot.
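The scoring the comment describes can be sketched numerically; the counts
and "ideal" values below are the comment's own example:

```python
# Each level's score is its actual sstable count relative to that level's
# ideal count; the level with the highest score gets compacted first.
# Numbers are the example from the code comment above.
levels = {"L0": (988, 4), "L1": (117, 10), "L2": (12, 100)}

def score(actual, ideal):
    return actual / ideal

for name, (actual, ideal) in levels.items():
    print(name, round(score(actual, ideal), 1))
# L0 scores 247.0 vs L1's 11.7, matching the "almost 250" vs "11" in the text.
```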


On Fri, Jun 21, 2013 at 5:42 PM, Wei Zhu <wz1975@yahoo.com> wrote:

> I think new SSTables will be written at the new size. In order to get
> that, you need to trigger a compaction so that new SSTables are generated.
> For LCS there is no major compaction, though. You can run a nodetool repair,
> and hopefully that will bring in some new SSTables and compactions will
> kick in. Or you can change the $CFName.json file under your data directory
> and move every SSTable to level 0. You need to stop your node, write a
> simple script to alter that file, and start the node again.
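A hypothetical version of that "simple script". The manifest layout assumed
here ($CFName.json containing a "generations" list of {"generation": N,
"members": [...]} entries) is a guess from the era; inspect your own file
before running anything like this, and only do it while the node is stopped:

```python
# Sketch only: the manifest schema is an assumption, not a documented API.
# Collapses every generation's members into a single level-0 generation.
import json

def move_all_to_l0(path):
    with open(path) as f:
        manifest = json.load(f)
    members = []
    for gen in manifest.get("generations", []):
        members.extend(gen.get("members", []))
    manifest["generations"] = [{"generation": 0, "members": members}]
    with open(path, "w") as f:
        json.dump(manifest, f)
```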
>
> I think it would be helpful to have a nodetool command to change the
> SSTable size and trigger a rebuild of the SSTables.
>
> Thanks.
> -Wei
>
> ------------------------------
> *From: *"Robert Coli" <rcoli@eventbrite.com>
> *To: *user@cassandra.apache.org
> *Sent: *Friday, June 21, 2013 4:51:29 PM
> *Subject: *Re: Updated sstable size for LCS, ran upgradesstables, file
> sizes didn't change
>
>
> On Fri, Jun 21, 2013 at 4:40 PM, Andrew Bialecki
> <andrew.bialecki@gmail.com> wrote:
> > However, when we alter the column family and then run "nodetool
> > upgradesstables -a keyspace columnfamily", the files in the data
> > directory are re-written, but the file sizes are the same.
> >
> > Is this the expected behavior? If not, what's the right way to upgrade
> > them? If this is expected, how can we benchmark the read/write
> > performance with varying sstable sizes?
>
> It is expected; upgradesstables/scrub/cleanup compactions work on a
> single sstable at a time, and are not capable of combining or
> splitting them.
>
> In theory you could probably:
>
> 1) start out with the largest size you want to test
> 2) stop your node
> 3) use sstable_split [1] to split sstables
> 4) start node, test
> 5) repeat 2-4
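Steps 1-5 might look something like the dry-run below. The command names
(nodetool, sstablesplit) and paths are assumptions from the era; swap in the
tool built from the sstable_split branch linked at [1], and review before
running anything for real:

```shell
# Dry-run sketch: commands are echoed, not executed, so the loop can be
# reviewed safely. All paths, sizes, and tool names are placeholders.
run() { echo "+ $*"; }              # replace 'echo "+ $*"' with "$@" to execute
for SIZE in 256 128 64; do          # step 1: start with the largest size
  run nodetool drain
  run service cassandra stop        # step 2: stop the node
  for f in /var/lib/cassandra/data/myks/mycf/*-Data.db; do
    run sstablesplit --size "$SIZE" "$f"   # step 3: split the sstables
  done
  run service cassandra start       # step 4: start the node, then benchmark
done                                # step 5: repeat at the next size down
```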
>
> I am not sure if there is anything about level compaction which makes
> this infeasible.
>
> =Rob
> [1] https://github.com/pcmanus/cassandra/tree/sstable_split
>
>
