cassandra-user mailing list archives

From Mike <>
Subject Re: Size Tiered -> Leveled Compaction
Date Sat, 16 Feb 2013 17:26:25 GMT
Another piece of information that would be useful is advice on how to 
properly set the SSTable size for your use case.  I understand the 
default is 5MB, a lot of examples show the use of 10MB, and I've seen 
cases where people have set it as high as 200MB.

Any information is appreciated,

On 2/14/2013 4:10 PM, Michael Theroux wrote:
> BTW, when I say "major compaction", I mean running the "nodetool 
> compact" command (which does a major compaction for Size Tiered 
> Compaction).  I didn't see the distribution of SSTables I expected 
> until I ran that command, in the steps I described below.
> -Mike
> On Feb 14, 2013, at 3:51 PM, Wei Zhu wrote:
>> I haven't tried to switch compaction strategy. We started with LCS.
>> For us, after massive data imports (5000 writes/second for 6 days), the 
>> first repair was painful since there was quite a lot of data inconsistency. 
>> For 150G nodes, repair brought in about 30G and created thousands of 
>> pending compactions. It took almost a day to clear those. Just be 
>> prepared: LCS is really slow in 1.1.X. System performance degrades 
>> during that time since reads could go to more SSTables; we see 20 
>> SSTable lookups for one read. (We tried everything we could and 
>> couldn't speed it up. I think it's single threaded, and it's not 
>> recommended to turn on multithreaded compaction. We even tried that; it 
>> didn't help.) There is parallel LCS in 1.2 which is supposed to 
>> alleviate the pain. Haven't upgraded yet, hope it works :)
>> Since our cluster is not write intensive (only 100 writes/second), I don't 
>> see any pending compactions during regular operation.
>> One thing worth mentioning is the size of the SSTable: the default is 5M, 
>> which is kind of small for a 200G (all in one CF) data set, and we are 
>> on SSD.  That is more than 150K files in one directory (200G/5M = 40K 
>> SSTables, and each SSTable creates 4 files on disk).  You might want to 
>> watch that when deciding the SSTable size.
>> By the way, there is no concept of Major compaction for LCS. Just for 
>> fun, you can look at a file called $CFName.json in your data 
>> directory and it tells you the SSTable distribution among different 
>> levels.
>> -Wei
>> ------------------------------------------------------------------------
>> *From:* Charles Brophy
>> *To:* <>
>> *Sent:* Thursday, February 14, 2013 8:29 AM
>> *Subject:* Re: Size Tiered -> Leveled Compaction
>> I second these questions: we've been looking into changing some of 
>> our CFs to use leveled compaction as well. If anybody here has the 
>> wisdom to answer them, it would be a wonderful help.
>> Thanks
>> Charles
>> On Wed, Feb 13, 2013 at 7:50 AM, Mike wrote:
>>     Hello,
>>     I'm investigating the transition of some of our column families
>>     from Size Tiered -> Leveled Compaction.  I believe we have some
>>     high-read-load column families that would benefit tremendously.
>>     I've stood up a test DB node to investigate the transition.  I
>>     successfully altered the column family, and I immediately noticed a
>>     large number (1000+) of pending compaction tasks become available,
>>     but no compactions got executed.
>>     I tried running "nodetool upgradesstables" on the column family,
>>     and the compaction tasks didn't move.
>>     I also noticed no changes to the size and distribution of the
>>     existing SSTables.
>>     I then ran a major compaction on the column family.  All pending
>>     compaction tasks got run, and the SSTables had the distribution
>>     that I would expect from LeveledCompaction (lots and lots of 10MB
>>     files).
>>     Couple of questions:
>>     1) Is a major compaction required to transition from size-tiered
>>     to leveled compaction?
>>     2) Are major compactions as much of a concern for
>>     LeveledCompaction as they are for Size Tiered?
>>     All the documentation I found concerning transitioning from Size
>>     Tiered to Leveled compaction discusses the ALTER TABLE CQL command,
>>     but I haven't found much on what else needs to be done after
>>     the schema change.
>>     I did these tests with Cassandra 1.1.9.
>>     Thanks,
>>     -Mike
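
Wei's file-count estimate in the thread (200G/5M = 40K SSTables, 4 files each) can be reproduced for other SSTable sizes with a few lines of Python. This is only a rough sketch, not anything Cassandra ships: the function name lcs_estimate is made up for illustration, the 4-files-per-SSTable figure is taken from Wei's message, and the level count assumes the usual LCS layout in which each level holds roughly 10x the SSTables of the one before it (L1 = 10, L2 = 100, ...), ignoring L0 and any in-flight compactions.

```python
import math


def lcs_estimate(data_gb, sstable_mb, files_per_sstable=4, fanout=10):
    """Rough estimate of SSTable count, on-disk files, and populated levels.

    Assumptions (not taken from Cassandra itself):
      - decimal units, so 200G / 5M = 40K as in the thread
      - files_per_sstable=4, the figure quoted in the thread
      - level n holds up to fanout**n SSTables; L0 is ignored
    """
    data_mb = data_gb * 1000
    sstables = math.ceil(data_mb / sstable_mb)
    files = sstables * files_per_sstable

    # Add levels until their combined capacity covers every SSTable.
    levels, capacity = 0, 0
    while capacity < sstables:
        levels += 1
        capacity += fanout ** levels
    return sstables, files, levels


if __name__ == "__main__":
    # The three sizes mentioned in the thread: 5MB, 10MB, 200MB.
    for size_mb in (5, 10, 200):
        sstables, files, levels = lcs_estimate(200, size_mb)
        print(f"{size_mb:>3} MB: ~{sstables} SSTables, ~{files} files, "
              f"{levels} levels")
```

For a 200G data set, the 5MB setting gives about 40,000 SSTables and 160,000 files, which lines up with the "more than 150K files" Wei reported. Larger SSTables cut the file count proportionally, but this sketch says nothing about the compaction cost per SSTable, so treat it as one input to the decision rather than a recommendation.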
