cassandra-user mailing list archives

From Keith Wright
Subject Re: sstable size change
Date Tue, 23 Jul 2013 13:48:48 GMT
Can you elaborate on what you mean by "let it take its own course organically"?  Will Cassandra
force any newly compacted files to my new setting as compactions are naturally triggered?

From: sankalp kohli
Date: Monday, July 22, 2013 4:48 PM
Subject: Re: sstable size change

You can remove the .json file; Cassandra will then treat all sstables as being in L0. Since
you have a lot of data, the resulting compaction will take a very long time. See the comment
below, taken directly from the Cassandra code. If you choose to do this, you might want to
increase the rate of compaction by the usual means. If you are on spinning disks, it might be
a very big problem. While the compaction runs, read performance will be impacted.

Unless there is a very urgent need to change the sstable size, I would change the size and
let it take its own course organically.
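
To put a rough number on "a very long time", here is a minimal Python sketch of a lower bound on
the time one node needs to rewrite its share of the table once. It assumes that the "usual
means" is the compaction throughput throttle (compaction_throughput_mb_per_sec, 16 MB/s by
default), and that the 500 GB quoted in the original question is the total on-disk size spread
evenly over the 6 nodes. Both are assumptions, and 64 MB/s is only a hypothetical raised value.

```python
def min_compaction_hours(data_gb: float, throughput_mb_per_sec: float) -> float:
    """Lower bound on the hours needed to rewrite data_gb once at a given throttle."""
    if throughput_mb_per_sec <= 0:
        raise ValueError("throughput must be positive")
    seconds = data_gb * 1024 / throughput_mb_per_sec
    return seconds / 3600


# Assumption: 500 GB is the total on-disk size, spread evenly over 6 nodes.
per_node_gb = 500 / 6

for throttle in (16, 64):  # MB/s; 16 is the default, 64 is a hypothetical raised value
    hours = min_compaction_hours(per_node_gb, throttle)
    print(f"{throttle} MB/s: at least {hours:.1f} h per node")
```

This is only a floor: LCS rewrites the same data several times as it is promoted through the
levels, so the real elapsed time would likely be a multiple of this figure.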

// LevelDB gives each level a score of how much data it contains vs its ideal amount, and
// compacts the level with the highest score. But this falls apart spectacularly once we
// get behind.  Consider this set of levels:
// L0: 988 [ideal: 4]
// L1: 117 [ideal: 10]
// L2: 12  [ideal: 100]
//
// The problem is that L0 has a much higher score (almost 250) than L1 (11), so what we'll
// do is compact a batch of MAX_COMPACTING_L0 sstables with all 117 L1 sstables, and put the
// result (say, 120 sstables) in L1. Then we'll compact the next batch of MAX_COMPACTING_L0,
// and so forth.  So we spend most of our i/o rewriting the L1 data with each batch.
//
// If we could just do *all* L0 a single time with L1, that would be ideal.  But we can't
// -- see the javadoc for MAX_COMPACTING_L0.
//
// LevelDB's way around this is to simply block writes if L0 compaction falls behind.
// We don't have that luxury.
//
// So instead, we
// 1) force compacting higher levels first, which minimizes the i/o needed to compact
//    optimally which gives us a long term win, and
// 2) if L0 falls behind, we will size-tiered compact it to reduce read overhead until
//    we can catch up on the higher levels.
//
// This isn't a magic wand -- if you are consistently writing too fast for LCS to keep
// up, you're still screwed.  But if instead you have intermittent bursts of activity,
// it can help a lot.
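
The arithmetic in that comment can be checked with a short Python sketch. It computes the
per-level scores from the counts quoted above, then compares how many sstables get read when L0
is merged into L1 batch by batch versus in a single pass. The batch size of 32 is an assumed
stand-in for MAX_COMPACTING_L0, and holding L1 at a constant 117 sstables between batches is a
simplification (the comment itself says L1 grows slightly, to "say, 120").

```python
import math


def level_scores(counts, ideals):
    """Score of each level: sstables present divided by the ideal count."""
    return [count / ideal for count, ideal in zip(counts, ideals)]


def sstables_read_batched(l0, l1, batch):
    """Sstables read if every L0 batch is compacted against all of L1.

    Simplification: L1 is held at a constant size between batches.
    """
    batches = math.ceil(l0 / batch)
    return l0 + batches * l1


counts = [988, 117, 12]   # L0, L1, L2 from the quoted comment
ideals = [4, 10, 100]

print(level_scores(counts, ideals))        # [247.0, 11.7, 0.12]

batched = sstables_read_batched(988, 117, batch=32)  # 32 is an assumed batch size
single_pass = 988 + 117
print(batched, single_pass)                # 4615 1105
```

Under these assumptions the batch-by-batch approach reads roughly four times as many sstables as
a single pass would, almost all of it re-reading L1. That is the wasted i/o the comment
describes.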

On Mon, Jul 22, 2013 at 12:51 PM, Andrew Bialecki wrote:
My understanding is that deleting the .json metadata file is currently the only way. If you
search the user list archives, there are folks who are building tools to force compaction and
rebuild sstables with the new size. I believe there's been a bit of talk of potentially
including those tools as part of a future release.

Also, to answer your question about bloom filters, those are handled differently and if you
run upgradesstables after altering the BF FP ratio, that will rebuild the BFs for each sstable.

On Mon, Jul 22, 2013 at 2:49 PM, Janne Jalkanen wrote:

I don't think upgradesstables is enough, since it's more of a "change this file to a new
format but don't try to merge sstables and compact" kind of thing.

Deleting the .json file is probably the only way, but someone more familiar with Cassandra
LCS might be able to tell whether manually editing the .json file so that you drop all sstables
down a level might work. Since they would overflow the new level, they would compact soon, but
the impact might be less drastic than just deleting the .json file (which takes everything
to L0)...
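
To see why sstables dropped by one level would overflow it, here is a small Python sketch that
only counts sstables. It uses the ideal counts from the code comment quoted elsewhere in this
thread (L0: 4, L1: 10, L2: 100) and assumes the factor of 10 continues for deeper levels. The
node layout is hypothetical, and the sketch says nothing about whether Cassandra would accept a
hand-edited manifest.

```python
def ideal_count(level):
    """Ideal sstable count per level: 4 for L0, then 10, 100, ... (x10 assumed)."""
    return 4 if level == 0 else 10 ** level


def drop_one_level(counts):
    """Move every sstable in level n to level n-1 (L0 and L1 both land in L0)."""
    dropped = [0] * len(counts)
    for level, count in enumerate(counts):
        dropped[max(level - 1, 0)] += count
    return dropped


# Hypothetical node with L1..L3 exactly at their ideal counts and an empty L0.
full = [0, 10, 100, 1000]
after = drop_one_level(full)

for level, count in enumerate(after):
    print(f"L{level}: {count} sstables, score {count / ideal_count(level):.1f}")
```

In this toy layout each populated level ends up at about 10 times its ideal count (2.5 times for
L0), so compaction would be triggered everywhere. By comparison, deleting the file puts all 1110
sstables in L0.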


On 22 Jul 2013, at 16:02, Keith Wright wrote:

Hi all,

   I know there have been several threads recently on this, but I wanted to make sure I got
a clear answer: we are looking to increase our SSTable size for a couple of our LCS tables,
as well as the chunk size (to match the SSD block size). The largest table is at 500 GB across
6 nodes (RF 3, C* 1.2.4, vnodes). I wanted to get feedback on the best way to make this change
with minimal load impact on the cluster. After I make the change, I understand that I need
to force the nodes to re-compact the tables.

Can this be done via upgradesstables, or do I need to shut down the node, delete the .json
file, and restart, as some have suggested?

I assume I can do this one node at a time?

If I change the bloom filter size, I assume I will need to force compaction again?  Using
the same methodology?

Thank you
