incubator-cassandra-user mailing list archives

From Michael Theroux <mthero...@yahoo.com>
Subject Re: Size Tiered -> Leveled Compaction
Date Mon, 25 Feb 2013 02:45:31 GMT
Aaron,

Thanks for the response.  I think I speak for many Cassandra users when I say we greatly appreciate
your help with our questions and issues.  For the specific bug I mentioned, I found this comment:

	http://data.story.lu/2012/10/15/cassandra-1-1-6-has-been-released

	"Automatic fixing of overlapping leveled sstables (CASSANDRA-4644)"

That said, I had difficulty putting two and two together from the comments in CASSANDRA-4644:
the ticket mentions being fixed in 1.1.6, but also being not reproducible.

We converted two column families yesterday (two that we believe are particularly well suited
for Leveled Compaction).  We have two more to convert, but those will wait until next weekend.
So far we have had no issues, and we've seen some positive results.

To answer some of the questions I posed in this thread, and because others have expressed
interest in knowing, here are the steps we followed:

1) Ran the ALTER TABLE command:

	ALTER TABLE X WITH compaction_strategy_class='LeveledCompactionStrategy' AND  compaction_strategy_options:sstable_size_in_mb=10;

2) Ran a compaction on every node:

	nodetool compact <keyspace> X 
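
For anyone scripting step 2 across a cluster, here is a minimal sketch that only builds the per-node command lines so they can be reviewed and run one node at a time. The host names, keyspace name, and the `conversion_commands` helper are hypothetical and not part of what we actually ran; the only real command involved is `nodetool -h <host> compact <keyspace> <column family>`.

```python
import shlex


def conversion_commands(hosts, keyspace, column_family):
    """Build one `nodetool compact` command line per node, in the order given."""
    return [
        shlex.join(["nodetool", "-h", host, "compact", keyspace, column_family])
        for host in hosts
    ]


# Hypothetical host, keyspace, and column family names, for illustration only.
commands = conversion_commands(["cass-node1", "cass-node2"], "my_keyspace", "X")
for command in commands:
    print(command)
```

Nothing is executed here on purpose: the compaction was resource heavy for us, so it seems safer to start each node by hand and watch the load.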

We converted one column family at a time, and temporarily disabled some of our routine maintenance
activities to reduce load during the conversion.  The compaction was resource heavy, and I wanted
to interfere with our operational activities as little as possible.  In our case, the compaction
after altering the schema took about an hour and a half.

Thus far, it appears everything worked without a hitch.  I chose 10 MB for the SSTable size,
based on Wei's feedback (whose data size is on par with ours) and other tidbits I found
through searching.  Because of issues people have reported in the relatively distant past, I
made sure that we are handling the compaction load properly, and I've run test repairs
on the specific tables we converted.  We also tested restarting a node after the conversion.

Again, I believe the tables we converted were particularly well suited for Leveled Compaction.
In these particular column families, reads outstrip writes by an order of magnitude or two.

So far, our results have been very positive.  We've seen a greater than 50% reduction in read
I/O, and a large improvement in performance for some activities.  We've also seen an improvement
in memory utilization.  I imagine others' mileage may vary.

If everything is stable over the next week, we will convert the last two tables we are considering
for Leveled Compaction.

Thanks again!
-Mike

On Feb 24, 2013, at 8:56 PM, aaron morton wrote:

> If you did not use LCS until after the upgrade to 1.1.9 I think you are ok. 
> 
> If in doubt the steps here look like they helped https://issues.apache.org/jira/browse/CASSANDRA-4644?focusedCommentId=13456137&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13456137
> 
> Cheers
> 
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
> 
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 23/02/2013, at 6:56 AM, Mike <mtheroux2@yahoo.com> wrote:
> 
>> Hello,
>> 
>> Still doing research before we potentially move one of our column families from
>> Size Tiered -> Leveled compaction this weekend.  I was looking into some of the
>> bugs that were filed against leveled compaction in Cassandra, and I found this:
>> 
>> https://issues.apache.org/jira/browse/CASSANDRA-4644
>> 
>> The bug mentions:
>> 
>> "You need to run the offline scrub (bin/sstablescrub) to fix the sstable overlapping
>> problem from early 1.1 releases. (Running with -m to just check for overlaps between sstables
>> should be fine, since you already scrubbed online which will catch out-of-order within an
>> sstable.)"
>> 
>> We recently upgraded from 1.1.2 to 1.1.9.
>> 
>> Does anyone know whether an offline scrub is recommended when switching
>> from STCS -> LCS after upgrading from 1.1.2?
>> 
>> Any insight would be appreciated,
>> Thanks,
>> -Mike
>> 
>> On 2/17/2013 8:57 PM, Wei Zhu wrote:
>>> We doubled the SSTable size to 10 MB. It still generates a lot of SSTables, and we
>>> don't see much difference in read latency.  We are able to finish the compactions after
>>> repair within several hours. We will increase the SSTable size again if we feel the number
>>> of SSTables hurts performance.
>>> 
>>> ----- Original Message -----
>>> From: "Mike" <mtheroux2@yahoo.com>
>>> To: user@cassandra.apache.org
>>> Sent: Sunday, February 17, 2013 4:50:40 AM
>>> Subject: Re: Size Tiered -> Leveled Compaction
>>> 
>>> 
>>> Hello Wei,
>>> 
>>> First thanks for this response.
>>> 
>>> Out of curiosity, what SSTable size did you choose for your use case, and what
>>> made you decide on that number?
>>> 
>>> Thanks,
>>> -Mike
>>> 
>>> On 2/14/2013 3:51 PM, Wei Zhu wrote:
>>> 
>>> I haven't tried to switch compaction strategy. We started with LCS.
>>> 
>>> 
>>> For us, after massive data imports (5000 writes/second for 6 days), the first repair
>>> was painful since there was quite a lot of data inconsistency. For 150 GB nodes, repair brought in
>>> about 30 GB and created thousands of pending compactions. It took almost a day to clear those.
>>> Just be prepared: LCS is really slow in 1.1.x. System performance degrades during that time
>>> since reads can touch more SSTables; we saw 20 SSTable lookups for one read. (We tried everything
>>> we could and couldn't speed it up. I think it's single threaded, and it's not recommended
>>> to turn on multithreaded compaction. We even tried that, and it didn't help.) There is parallel LCS
>>> in 1.2, which is supposed to alleviate the pain. We haven't upgraded yet; hope it works :)
>>> 
>>> 
>>> http://www.datastax.com/dev/blog/performance-improvements-in-cassandra-1-2
>>> 
>>> Since our cluster is not write intensive (only 100 writes/second), I don't see any
>>> pending compactions during regular operation.
>>> 
>>> 
>>> One thing worth mentioning is the size of the SSTable. The default is 5 MB, which is
>>> kind of small for a 200 GB (all in one CF) data set, and we are on SSD. That is more than 150K files
>>> in one directory (200 GB / 5 MB = 40K SSTables, and each SSTable creates 4 files on disk). You might
>>> want to watch that when deciding on the SSTable size.
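
The arithmetic above can be sketched as a small calculation for trying other SSTable sizes. This is only a rough estimate under stated assumptions: 1 GB is taken as 1024 MB, every SSTable is assumed to be exactly the configured size, and the 4 files per SSTable figure is simply the one quoted above. The function name is made up for illustration.

```python
FILES_PER_SSTABLE = 4  # figure quoted in this thread; treat it as an assumption


def sstable_file_estimate(data_gb, sstable_size_mb, files_per_sstable=FILES_PER_SSTABLE):
    """Estimate (SSTable count, on-disk file count) for one column family."""
    sstables = (data_gb * 1024) // sstable_size_mb
    return sstables, sstables * files_per_sstable


# Compare the 5 MB default with larger sizes for a 200 GB column family.
for size_mb in (5, 10, 20):
    sstables, files = sstable_file_estimate(200, size_mb)
    print(f"{size_mb} MB SSTables: about {sstables} SSTables, about {files} files")
```

Doubling the SSTable size halves both numbers, which is the trade-off to weigh against how many files you are comfortable keeping in one directory.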
>>> 
>>> 
>>> By the way, there is no concept of major compaction for LCS. Just for fun, you
>>> can look at a file called $CFName.json in your data directory; it tells you the SSTable
>>> distribution among the different levels.
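
A quick way to summarize that file might look like the sketch below. The JSON layout used here (a top-level "generations" list whose entries carry a "generation" number and a "members" list of SSTable generation numbers) is an assumption about the 1.1-era manifest and should be checked against a real $CFName.json before relying on it. The sample data and the function name are invented for illustration.

```python
import json

# ASSUMED manifest layout with made-up contents; verify against a real file.
SAMPLE_MANIFEST = """
{"generations": [
  {"generation": 0, "members": [101, 102]},
  {"generation": 1, "members": [90, 91, 92]},
  {"generation": 2, "members": []}
]}
"""


def sstables_per_level(manifest_text):
    """Map each level number to the number of SSTables listed for it."""
    manifest = json.loads(manifest_text)
    return {
        entry["generation"]: len(entry["members"])
        for entry in manifest["generations"]
    }


levels = sstables_per_level(SAMPLE_MANIFEST)
for level, count in sorted(levels.items()):
    print(f"L{level}: {count} SSTables")
```

To use it on a node, read the manifest file's text and pass it to `sstables_per_level` in place of the sample string.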
>>> 
>>> 
>>> -Wei
>>> 
>>> 
>>> From: Charles Brophy <cbrophy@zulily.com>
>>> To: user@cassandra.apache.org
>>> Sent: Thursday, February 14, 2013 8:29 AM
>>> Subject: Re: Size Tiered -> Leveled Compaction
>>> 
>>> 
>>> I second these questions: we've been looking into changing some of our CFs to
>>> use leveled compaction as well. If anybody here has the wisdom to answer them, it would be
>>> a wonderful help.
>>> 
>>> 
>>> Thanks
>>> Charles
>>> 
>>> 
>>> On Wed, Feb 13, 2013 at 7:50 AM, Mike < mtheroux2@yahoo.com > wrote:
>>> 
>>> 
>>> Hello,
>>> 
>>> I'm investigating the transition of some of our column families from Size Tiered
>>> -> Leveled Compaction. I believe we have some high-read-load column families that would
>>> benefit tremendously.
>>> 
>>> I've stood up a test DB node to investigate the transition. I successfully altered
>>> the column family, and I immediately noticed a large number (1000+) of pending compaction tasks
>>> become available, but no compactions were executed.
>>> 
>>> I tried running "nodetool upgradesstables" on the column family, and the compaction
>>> tasks didn't move.
>>> 
>>> I also noticed no changes to the size and distribution of the existing SSTables.
>>> 
>>> I then ran a major compaction on the column family. All pending compaction tasks
>>> were run, and the SSTables had the distribution I would expect from LeveledCompaction (lots
>>> and lots of 10MB files).
>>> 
>>> Couple of questions:
>>> 
>>> 1) Is a major compaction required to transition from size-tiered to leveled compaction?
>>> 2) Are major compactions as much of a concern for LeveledCompaction as they
>>> are for Size Tiered?
>>> 
>>> All the documentation I found concerning transitioning from Size Tiered to Leveled
>>> compaction discusses the ALTER TABLE CQL command, but I haven't found much on what else
>>> needs to be done after the schema change.
>>> 
>>> I did these tests with Cassandra 1.1.9.
>>> 
>>> Thanks,
>>> -Mike

