From: Michael Theroux
To: user@cassandra.apache.org
Subject: Re: Size Tiered -> Leveled Compaction
Date: Sun, 24 Feb 2013 21:45:31 -0500
Aaron,

Thanks for the response.  I think I speak for many Cassandra users when I say we greatly appreciate your help with our questions and issues.

For the specific bug I mentioned, I found this comment: http://data.story.lu/2012/10/15/cassandra-1-1-6-has-been-released

"Automatic fixing of overlapping leveled sstables (CASSANDRA-4644)"

Although I had difficulty putting two and two together from the comments in CASSANDRA-4644 (it mentions being fixed in 1.1.6, but is also marked not reproducible).

We converted two column families yesterday (two we believe are particularly well suited for Leveled Compaction).  We have two more to convert, but those will wait until next weekend.  So far there have been no issues, and we've seen some positive results.

To help answer some of my own questions posed in this thread, which others have expressed interest in, the steps we followed were:

1) Perform the proper ALTER TABLE command:

ALTER TABLE X WITH compaction_strategy_class='LeveledCompactionStrategy' AND compaction_strategy_options:sstable_size_in_mb=10;

2) Run a compaction on all nodes:

nodetool compact <keyspace> X

We converted one column family at a time, and temporarily disabled some maintenance activities to decrease load during the conversion, as the compaction was resource heavy and I wanted to interfere with our operational activities as little as possible.  In our case, the compaction after altering the schema took about an hour and a half.

Thus far, everything appears to have worked without a hitch.  I chose 10 MB for the SSTable size based on Wei's feedback (whose data size is on par with ours) and other tidbits I found while searching through issues people have reported in the relatively distant past.  I made sure that we've been handling the compaction load properly, and I've run test repairs on the specific tables we converted.
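The two steps above can be sketched as simple command builders (a minimal illustration only; the keyspace and table names are placeholders, and nothing here talks to a live cluster):

```python
# Sketch of the STCS -> LCS conversion steps described above.
# "X" and "my_keyspace" are placeholder names.

def alter_to_lcs(table: str, sstable_mb: int = 10) -> str:
    """CQL to switch a column family to LeveledCompactionStrategy
    (Cassandra 1.1-era syntax, as quoted in the steps above)."""
    return (
        f"ALTER TABLE {table} WITH "
        f"compaction_strategy_class='LeveledCompactionStrategy' AND "
        f"compaction_strategy_options:sstable_size_in_mb={sstable_mb};"
    )

def compact_command(keyspace: str, table: str) -> str:
    """nodetool invocation to run on each node after the ALTER."""
    return f"nodetool compact {keyspace} {table}"

# Step 1 (run via cqlsh), then step 2 (run on every node):
print(alter_to_lcs("X"))
print(compact_command("my_keyspace", "X"))
```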
We also tested restarting a node after the conversion.

Again, I believe the tables we converted are particularly well suited for Leveled Compaction.  These column families see reads outstrip writes by an order of magnitude or two.

So far, our results have been very positive.  We've seen a greater than 50% reduction in read I/O and a large improvement in performance for some activities.  We've also seen an improvement in memory utilization.  I imagine others' mileage may vary.

If everything is stable over the next week, we will convert the last two tables we are considering for Leveled Compaction.

Thanks again!
-Mike

On Feb 24, 2013, at 8:56 PM, aaron morton wrote:

> If you did not use LCS until after the upgrade to 1.1.9, I think you are ok.
>
> If in doubt, the steps here look like they helped: https://issues.apache.org/jira/browse/CASSANDRA-4644?focusedCommentId=13456137&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13456137
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 23/02/2013, at 6:56 AM, Mike <mtheroux2@yahoo.com> wrote:
>
>> Hello,
>>
>> Still doing research before we potentially move one of our column families from Size Tiered -> Leveled Compaction this weekend.  I was doing some research around some of the bugs that were filed against leveled compaction in Cassandra and I found this:
>>
>> https://issues.apache.org/jira/browse/CASSANDRA-4644
>>
>> The bug mentions:
>>
>> "You need to run the offline scrub (bin/sstablescrub) to fix the sstable overlapping problem from early 1.1 releases. (Running with -m to just check for overlaps between sstables should be fine, since you already scrubbed online which will catch out-of-order within an sstable.)"
>>
>> We recently upgraded from 1.1.2 to 1.1.9.
>>
>> Does anyone know if an offline scrub is recommended when switching from STCS -> LCS after upgrading from 1.1.2?
>>
>> Any insight would be appreciated,
>> Thanks,
>> -Mike
>>
>> On 2/17/2013 8:57 PM, Wei Zhu wrote:
>>> We doubled the SSTable size to 10M.  It still generates a lot of SSTables, and we don't see much difference in read latency.  We are able to finish the compactions after repair within several hours.  We will increase the SSTable size again if we feel the number of SSTables hurts performance.
>>>
>>> ----- Original Message -----
>>> From: "Mike" <mtheroux2@yahoo.com>
>>> To: user@cassandra.apache.org
>>> Sent: Sunday, February 17, 2013 4:50:40 AM
>>> Subject: Re: Size Tiered -> Leveled Compaction
>>>
>>> Hello Wei,
>>>
>>> First, thanks for this response.
>>>
>>> Out of curiosity, what SSTable size did you choose for your use case, and what made you decide on that number?
>>>
>>> Thanks,
>>> -Mike
>>>
>>> On 2/14/2013 3:51 PM, Wei Zhu wrote:
>>>
>>> I haven't tried to switch compaction strategy; we started with LCS.
>>>
>>> For us, after massive data imports (5000 w/second for 6 days), the first repair is painful since there is quite some data inconsistency.  For 150G nodes, repair brought in about 30G and created thousands of pending compactions.  It took almost a day to clear those.  Just be prepared: LCS is really slow in 1.1.X.  System performance degrades during that time since reads can hit more SSTables; we saw 20 SSTable lookups for one read.  (We tried everything we could and couldn't speed it up.  I think it's single threaded, and it's not recommended to turn on multithreaded compaction.  We even tried that; it didn't help.)  There is parallel LCS in 1.2, which is supposed to alleviate the pain.
>>> Haven't upgraded yet; hope it works :)
>>>
>>> http://www.datastax.com/dev/blog/performance-improvements-in-cassandra-1-2
>>>
>>> Since our cluster is not write intensive (only 100 w/second), I don't see any pending compactions during regular operation.
>>>
>>> One thing worth mentioning is the size of the SSTables.  The default is 5M, which is kind of small for a 200G (all in one CF) data set, and we are on SSD.  That is more than 150K files in one directory (200G/5M = 40K SSTables, and each SSTable creates 4 files on disk).  You might want to watch that and decide on the SSTable size.
>>>
>>> By the way, there is no concept of major compaction for LCS.  Just for fun, you can look at a file called $CFName.json in your data directory; it tells you the SSTable distribution among the different levels.
>>>
>>> -Wei
>>>
>>> From: Charles Brophy <cbrophy@zulily.com>
>>> To: user@cassandra.apache.org
>>> Sent: Thursday, February 14, 2013 8:29 AM
>>> Subject: Re: Size Tiered -> Leveled Compaction
>>>
>>> I second these questions: we've been looking into changing some of our CFs to use leveled compaction as well.  If anybody here has the wisdom to answer them it would be of wonderful help.
>>>
>>> Thanks
>>> Charles
>>>
>>> On Wed, Feb 13, 2013 at 7:50 AM, Mike <mtheroux2@yahoo.com> wrote:
>>>
>>> Hello,
>>>
>>> I'm investigating the transition of some of our column families from Size Tiered -> Leveled Compaction.  I believe we have some high-read-load column families that would benefit tremendously.
>>>
>>> I've stood up a test DB node to investigate the transition.  I successfully altered the column family, and I immediately noticed a large number (1000+) of pending compaction tasks become available, but no compactions get executed.
>>>
>>> I tried running "nodetool sstableupgrade" on the column family, and the compaction tasks don't move.
>>>
>>> I also noticed no changes to the size and distribution of the existing SSTables.
>>>
>>> I then ran a major compaction on the column family.  All pending compaction tasks got run, and the SSTables have a distribution that I would expect from LeveledCompaction (lots and lots of 10MB files).
>>>
>>> A couple of questions:
>>>
>>> 1) Is a major compaction required to transition from size-tiered to leveled compaction?
>>> 2) Are major compactions as much of a concern for LeveledCompaction as they are for Size Tiered?
>>>
>>> All the documentation I found concerning transitioning from Size Tiered to Leveled compaction discusses the ALTER TABLE CQL command, but I haven't found too much on what else needs to be done after the schema change.
>>>
>>> I did these tests with Cassandra 1.1.9.
>>>
>>> Thanks,
>>> -Mike
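Wei's file-count arithmetic quoted above (200G of data at the default 5M SSTable size, with 4 files per SSTable on disk) can be sanity-checked with a short script; the sizes are taken from his message, and the helper itself is only illustrative:

```python
# Estimate on-disk file counts for an LCS column family, per Wei's
# arithmetic above: total data / sstable size = sstable count, and
# each SSTable is stored as 4 files on disk (1.1-era layout).

def lcs_file_estimate(data_mb: int, sstable_mb: int, files_per_sstable: int = 4):
    sstables = data_mb // sstable_mb
    return sstables, sstables * files_per_sstable

# 200G at the 5M default: roughly the "40K SSTables / 150K+ files
# in one directory" Wei describes.
print(lcs_file_estimate(200 * 1024, 5))   # (40960, 163840)

# Doubling the SSTable size to 10M, as Wei's cluster did, halves both.
print(lcs_file_estimate(200 * 1024, 10))  # (20480, 81920)
```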