From: Michael Theroux
To: user@cassandra.apache.org
Subject: Re: Size Tiered -> Leveled Compaction
Date: Sun, 24 Feb 2013 21:45:31 -0500
Aaron,

Thanks for the response.  I think I speak for many Cassandra users when I say we greatly appreciate your help with our questions and issues.

For the specific bug I mentioned, I found this comment: http://data.story.lu/2012/10/15/cassandra-1-1-6-has-been-released

"Automatic fixing of overlapping leveled sstables (CASSANDRA-4644)"

Although I had difficulty putting two and two together from the comments in CASSANDRA-4644 (it mentions being fixed in 1.1.6, but is also marked not reproducible).

We converted two column families yesterday (two we believe are particularly well suited for Leveled Compaction).  We have two more to convert, but those will wait until next weekend.  So far there have been no issues, and we've seen some positive results.

To help answer some of my own questions posed in this thread, which others have expressed interest in, the steps we followed were:

1) Perform the proper ALTER TABLE command:

ALTER TABLE X WITH compaction_strategy_class='LeveledCompactionStrategy' AND compaction_strategy_options:sstable_size_in_mb=10;

2) Run a compaction on all nodes:

nodetool compact <keyspace> X

We converted one column family at a time, and temporarily disabled some maintenance activities to decrease load during the conversion, as the compaction was resource heavy and I wanted to interfere with our operational activities as little as possible.  In our case, the compaction after altering the schema took about an hour and a half.

Thus far, everything appears to have worked without a hitch.  I chose 10 MB for the SSTable size based on Wei's feedback (whose data size is on par with ours) and other tidbits I found while searching through issues people have reported in the relatively distant past.  I made sure that we've been handling the compaction load properly, and I've run test repairs on the specific tables we converted.
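The two steps above can be sketched as simple command builders (a minimal illustration only; the keyspace and table names are placeholders, and nothing here talks to a live cluster):

```python
# Sketch of the STCS -> LCS conversion steps described above.
# "X" and "my_keyspace" are placeholder names.

def alter_to_lcs(table: str, sstable_mb: int = 10) -> str:
    """CQL to switch a column family to LeveledCompactionStrategy
    (Cassandra 1.1-era syntax, as quoted in the steps above)."""
    return (
        f"ALTER TABLE {table} WITH "
        f"compaction_strategy_class='LeveledCompactionStrategy' AND "
        f"compaction_strategy_options:sstable_size_in_mb={sstable_mb};"
    )

def compact_command(keyspace: str, table: str) -> str:
    """nodetool invocation to run on each node after the ALTER."""
    return f"nodetool compact {keyspace} {table}"

# Step 1 (run via cqlsh), then step 2 (run on every node):
print(alter_to_lcs("X"))
print(compact_command("my_keyspace", "X"))
```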
We also tested restarting a node after the conversion.

Again, I believe the tables we converted are particularly well suited for Leveled Compaction.  These column families see reads outstrip writes by an order of magnitude or two.

So far, our results have been very positive.  We've seen a greater than 50% reduction in read I/O and a large improvement in performance for some activities.  We've also seen an improvement in memory utilization.  I imagine others' mileage may vary.

If everything is stable over the next week, we will convert the last two tables we are considering for Leveled Compaction.

Thanks again!
-Mike

On Feb 24, 2013, at 8:56 PM, aaron morton wrote:

> If you did not use LCS until after the upgrade to 1.1.9, I think you are ok.
>
> If in doubt, the steps here look like they helped: https://issues.apache.org/jira/browse/CASSANDRA-4644?focusedCommentId=13456137&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13456137
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 23/02/2013, at 6:56 AM, Mike <mtheroux2@yahoo.com> wrote:
>
>> Hello,
>>
>> Still doing research before we potentially move one of our column families from Size Tiered -> Leveled Compaction this weekend.  I was doing some research around some of the bugs that were filed against leveled compaction in Cassandra and I found this:
>>
>> https://issues.apache.org/jira/browse/CASSANDRA-4644
>>
>> The bug mentions:
>>
>> "You need to run the offline scrub (bin/sstablescrub) to fix the sstable overlapping problem from early 1.1 releases. (Running with -m to just check for overlaps between sstables should be fine, since you already scrubbed online which will catch out-of-order within an sstable.)"
>>
>> We recently upgraded from 1.1.2 to 1.1.9.
>>
>> Does anyone know if an offline scrub is recommended when switching from STCS -> LCS after upgrading from 1.1.2?
>>
>> Any insight would be appreciated,
>> Thanks,
>> -Mike
>>
>> On 2/17/2013 8:57 PM, Wei Zhu wrote:
>>> We doubled the SSTable size to 10M.  It still generates a lot of SSTables, and we don't see much difference in read latency.  We are able to finish the compactions after repair within several hours.  We will increase the SSTable size again if we feel the number of SSTables hurts performance.
>>>
>>> ----- Original Message -----
>>> From: "Mike" <mtheroux2@yahoo.com>
>>> To: user@cassandra.apache.org
>>> Sent: Sunday, February 17, 2013 4:50:40 AM
>>> Subject: Re: Size Tiered -> Leveled Compaction
>>>
>>> Hello Wei,
>>>
>>> First, thanks for this response.
>>>
>>> Out of curiosity, what SSTable size did you choose for your use case, and what made you decide on that number?
>>>
>>> Thanks,
>>> -Mike
>>>
>>> On 2/14/2013 3:51 PM, Wei Zhu wrote:
>>>
>>> I haven't tried to switch compaction strategy; we started with LCS.
>>>
>>> For us, after massive data imports (5000 w/second for 6 days), the first repair is painful since there is quite some data inconsistency.  For 150G nodes, repair brought in about 30G and created thousands of pending compactions.  It took almost a day to clear those.  Just be prepared: LCS is really slow in 1.1.X.  System performance degrades during that time since reads can hit more SSTables; we saw 20 SSTable lookups for one read.  (We tried everything we could and couldn't speed it up.  I think it's single threaded, and it's not recommended to turn on multithreaded compaction.  We even tried that; it didn't help.)  There is parallel LCS in 1.2, which is supposed to alleviate the pain.
>>> Haven't upgraded yet; hope it works :)
>>>
>>> http://www.datastax.com/dev/blog/performance-improvements-in-cassandra-1-2
>>>
>>> Since our cluster is not write intensive (only 100 w/second), I don't see any pending compactions during regular operation.
>>>
>>> One thing worth mentioning is the size of the SSTables.  The default is 5M, which is kind of small for a 200G (all in one CF) data set, and we are on SSD.  That is more than 150K files in one directory (200G/5M = 40K SSTables, and each SSTable creates 4 files on disk).  You might want to watch that and decide on the SSTable size.
>>>
>>> By the way, there is no concept of major compaction for LCS.  Just for fun, you can look at a file called $CFName.json in your data directory; it tells you the SSTable distribution among the different levels.
>>>
>>> -Wei
>>>
>>> From: Charles Brophy <cbrophy@zulily.com>
>>> To: user@cassandra.apache.org
>>> Sent: Thursday, February 14, 2013 8:29 AM
>>> Subject: Re: Size Tiered -> Leveled Compaction
>>>
>>> I second these questions: we've been looking into changing some of our CFs to use leveled compaction as well.  If anybody here has the wisdom to answer them it would be of wonderful help.
>>>
>>> Thanks
>>> Charles
>>>
>>> On Wed, Feb 13, 2013 at 7:50 AM, Mike <mtheroux2@yahoo.com> wrote:
>>>
>>> Hello,
>>>
>>> I'm investigating the transition of some of our column families from Size Tiered -> Leveled Compaction.  I believe we have some high-read-load column families that would benefit tremendously.
>>>
>>> I've stood up a test DB node to investigate the transition.  I successfully altered the column family, and I immediately noticed a large number (1000+) of pending compaction tasks become available, but no compactions get executed.
>>>
>>> I tried running "nodetool sstableupgrade" on the column family, and the compaction tasks don't move.
>>>
>>> I also noticed no changes to the size and distribution of the existing SSTables.
>>>
>>> I then ran a major compaction on the column family.  All pending compaction tasks got run, and the SSTables have a distribution that I would expect from LeveledCompaction (lots and lots of 10MB files).
>>>
>>> A couple of questions:
>>>
>>> 1) Is a major compaction required to transition from size-tiered to leveled compaction?
>>> 2) Are major compactions as much of a concern for LeveledCompaction as they are for Size Tiered?
>>>
>>> All the documentation I found concerning transitioning from Size Tiered to Leveled compaction discusses the ALTER TABLE CQL command, but I haven't found too much on what else needs to be done after the schema change.
>>>
>>> I did these tests with Cassandra 1.1.9.
>>>
>>> Thanks,
>>> -Mike
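Wei's file-count arithmetic quoted above (200G of data at the default 5M SSTable size, with 4 files per SSTable on disk) can be sanity-checked with a short script; the sizes are taken from his message, and the helper itself is only illustrative:

```python
# Estimate on-disk file counts for an LCS column family, per Wei's
# arithmetic above: total data / sstable size = sstable count, and
# each SSTable is stored as 4 files on disk (1.1-era layout).

def lcs_file_estimate(data_mb: int, sstable_mb: int, files_per_sstable: int = 4):
    sstables = data_mb // sstable_mb
    return sstables, sstables * files_per_sstable

# 200G at the 5M default: roughly the "40K SSTables / 150K+ files
# in one directory" Wei describes.
print(lcs_file_estimate(200 * 1024, 5))   # (40960, 163840)

# Doubling the SSTable size to 10M, as Wei's cluster did, halves both.
print(lcs_file_estimate(200 * 1024, 10))  # (20480, 81920)
```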