incubator-cassandra-user mailing list archives

From Mike <mthero...@yahoo.com>
Subject Re: Size Tiered -> Leveled Compaction
Date Fri, 22 Feb 2013 17:56:41 GMT
Hello,

We are still doing research before we potentially move one of our column 
families from Size Tiered->Leveled compaction this weekend.  I was looking 
into some of the bugs that were filed against leveled compaction in 
Cassandra and I found this:

https://issues.apache.org/jira/browse/CASSANDRA-4644

The bug mentions:

"You need to run the offline scrub (bin/sstablescrub) to fix the sstable 
overlapping problem from early 1.1 releases. (Running with -m to just 
check for overlaps between sstables should be fine, since you already 
scrubbed online which will catch out-of-order within an sstable.)"

We recently upgraded from 1.1.2 to 1.1.9.

Does anyone know if an offline scrub is recommended when switching from 
STCS->LCS after upgrading from 1.1.2?

Any insight would be appreciated,
Thanks,
-Mike

On 2/17/2013 8:57 PM, Wei Zhu wrote:
> We doubled the SSTable size to 10M. It still generates a lot of SSTables and
> we don't see much difference in read latency. We are able to finish the
> compactions after repair within several hours. We will increase the SSTable
> size again if we feel the number of SSTables hurts performance.
>
> ----- Original Message -----
> From: "Mike" <mtheroux2@yahoo.com>
> To: user@cassandra.apache.org
> Sent: Sunday, February 17, 2013 4:50:40 AM
> Subject: Re: Size Tiered -> Leveled Compaction
>
>
> Hello Wei,
>
> First thanks for this response.
>
> Out of curiosity, what SSTable size did you choose for your use case, and
> what made you decide on that number?
>
> Thanks,
> -Mike
>
> On 2/14/2013 3:51 PM, Wei Zhu wrote:
>
> I haven't tried to switch compaction strategy. We started with LCS.
>
>
> For us, after massive data imports (5000 writes/second for 6 days), the first
> repair is painful since there is quite a lot of data inconsistency. For 150G
> nodes, repair brought in about 30G and created thousands of pending
> compactions. It took almost a day to clear those. Just be prepared: LCS is
> really slow in 1.1.x. System performance degrades during that time since
> reads can touch more SSTables; we see 20 SSTable lookups for one read. (We
> tried everything we could and couldn't speed it up. I think it's single
> threaded, and it's not recommended to turn on multithreaded compaction. We
> even tried that, and it didn't help.) There is parallel LCS in 1.2, which is
> supposed to alleviate the pain. We haven't upgraded yet; hope it works :)
>
>
> http://www.datastax.com/dev/blog/performance-improvements-in-cassandra-1-2
>
>
>
>
>
> Since our cluster is not write intensive (only 100 writes/second), I don't
> see any pending compactions during regular operation.
>
>
> One thing worth mentioning is the size of the SSTable. The default is 5M,
> which is kind of small for a 200G (all in one CF) data set, and we are on
> SSD. That is more than 150K files in one directory (200G/5M = 40K SSTables,
> and each SSTable creates 4 files on disk). You might want to watch that when
> you decide on the SSTable size.
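
The file-count estimate quoted above can be reproduced with a short sketch. The function name and the decimal unit constants are illustrative assumptions, not anything from Cassandra; the "4 files per SSTable" figure is taken from the message itself.

```python
# Rough estimate of how many SSTables and on-disk files LCS produces
# for a given data set size and target SSTable size.
# Decimal units are assumed so the result matches the 200G/5M = 40K figure.
GB = 1000 ** 3
MB = 1000 ** 2


def estimate_lcs_files(data_bytes, sstable_bytes, files_per_sstable=4):
    """Return (sstable_count, file_count) for the given sizes."""
    # Ceiling division: a partially filled SSTable still counts as one.
    sstables = -(-data_bytes // sstable_bytes)
    return sstables, sstables * files_per_sstable


if __name__ == "__main__":
    for size_mb in (5, 10):
        sstables, files = estimate_lcs_files(200 * GB, size_mb * MB)
        print(f"{size_mb}M SSTables: {sstables} SSTables, {files} files")
```

Under these assumptions, doubling the SSTable size from 5M to 10M halves both counts.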
>
>
> By the way, there is no concept of major compaction for LCS. Just for fun,
> you can look at a file called $CFName.json in your data directory; it tells
> you the SSTable distribution among the different levels.
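
A minimal sketch of summarizing such a manifest follows. It assumes the 1.1-era layout of a top-level "generations" array whose entries carry a "generation" (level) number and a "members" list of SSTable generation numbers; check that against your own $CFName.json before relying on it. The function name and the sample text are hypothetical.

```python
import json


def sstables_per_level(manifest_text):
    """Map each level number to the number of SSTables listed in it.

    Assumes the manifest layout described above ("generations" ->
    "generation" / "members"); this is an assumption, not a documented API.
    """
    manifest = json.loads(manifest_text)
    return {
        entry["generation"]: len(entry["members"])
        for entry in manifest["generations"]
    }


if __name__ == "__main__":
    # Hypothetical sample content, not taken from a real cluster.
    sample = (
        '{"generations": ['
        '{"generation": 0, "members": [12, 13]}, '
        '{"generation": 1, "members": [4, 5, 6]}]}'
    )
    print(sstables_per_level(sample))
```

To use it on a real node, read the file's contents into a string and pass that to `sstables_per_level`.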
>
>
> -Wei
>
> From: Charles Brophy <cbrophy@zulily.com>
> To: user@cassandra.apache.org
> Sent: Thursday, February 14, 2013 8:29 AM
> Subject: Re: Size Tiered -> Leveled Compaction
>
>
> I second these questions: we've been looking into changing some of our CFs
> to use leveled compaction as well. If anybody here has the wisdom to answer
> them, it would be a wonderful help.
>
>
> Thanks
> Charles
>
>
> On Wed, Feb 13, 2013 at 7:50 AM, Mike < mtheroux2@yahoo.com > wrote:
>
>
> Hello,
>
> I'm investigating the transition of some of our column families from Size
> Tiered -> Leveled Compaction. I believe we have some high-read-load column
> families that would benefit tremendously.
>
> I've stood up a test DB node to investigate the transition. I successfully
> altered the column family, and I immediately noticed a large number (1000+)
> of pending compaction tasks become available, but no compactions were
> executed.
>
> I tried running "nodetool upgradesstables" on the column family, and the
> compaction tasks didn't move.
>
> I also noticed no changes to the size and distribution of the existing SSTables.
>
> I then ran a major compaction on the column family. All pending compaction
> tasks ran, and the SSTables now have the distribution I would expect from
> LeveledCompaction (lots and lots of 10MB files).
>
> Couple of questions:
>
> 1) Is a major compaction required to transition from size-tiered to leveled compaction?
> 2) Are major compactions as much of a concern for LeveledCompaction as they
> are for Size Tiered?
>
> All the documentation I found concerning the transition from Size Tiered to
> Leveled compaction discusses the alter table CQL command, but I haven't found
> much on what else needs to be done after the schema change.
>
> I did these tests with Cassandra 1.1.9.
>
> Thanks,
> -Mike
>

