Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (athena.apache.org: local policy)
From: "Hiller, Dean" <Dean.Hiller@nrel.gov>
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Mon, 25 Feb 2013 07:47:41 -0700
Subject: Re: Size Tiered -> Leveled Compaction
Thread-Topic: Size Tiered -> Leveled Compaction
Thread-Index: Ac4TZwgxJinfsGnKSs6ejCWB4P6e9Q==
Message-ID: <CD50C795.2136A%Dean.Hiller@nrel.gov>
In-Reply-To: 
 <CA+VSrLr6duGsswvqZZjXF5ZAFzX5fkgYT7Q2RTqXQ3G8Wg_vig@mail.gmail.com>
Accept-Language: en-US
Content-Language: en-US
user-agent: Microsoft-MacOutlook/14.2.5.121010
acceptlanguage: en-US
Content-Type: text/plain; charset="Windows-1252"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0

Sweet, thanks for the info.
Dean

From: Alain RODRIGUEZ <arodrime@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Monday, February 25, 2013 7:41 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Size Tiered -> Leveled Compaction

"After running a major compaction, automatic minor compactions are no longer triggered,"

... Because of the size difference between the big sstable generated and the new sstables flushed/compacted. Compactions are not stopped, they are just "no longer triggered" for a while.

"frequently requiring you to manually run major compactions on a routine basis"

... In order to keep good read latency. If you don't run compaction periodically and you have some row updates, you will have an increasing number of rows spread across various sstables. But my guess is that if you have no deletes, no updates and no TTLs, only write-once rows, you may keep this big sstable uncompacted for as long as you want without any read performance degradation.

I think the documentation just doesn't go deep enough in the explanation, or maybe this information already exists somewhere else in the documentation.

Wait for confirmation from an expert; I am just a humble user.

Alain


2013/2/25 Hiller, Dean <Dean.Hiller@nrel.gov>
So what you are saying is this documentation is not quite accurate then... (I am more confused between your statement and the documentation now)

http://www.datastax.com/docs/1.1/operations/tuning

Which says "After running a major compaction, automatic minor compactions are no longer triggered, frequently requiring you to manually run major compactions on a routine basis"

Which implied that you have to keep running major compactions and that minor compactions are not kicking in anymore :( :( and we (my project) want minor compactions to continue.

Thanks,
Dean


From: Alain RODRIGUEZ <arodrime@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Monday, February 25, 2013 7:15 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Size Tiered -> Leveled Compaction

"I am confused.  I thought running compact turns off the minor compactions and users are actually supposed to run upgradesstables????  (maybe I am on old documentation?)"

Well, that's not true. What happens is that compaction uses sstables of approximately the same size. So if you run a major compaction on a 10GB CF, you have almost no chance of getting that (big) sstable compacted again. You will have to wait for other sstables to reach this size or run another major compaction.
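
A simplified sketch of the size-bucketing idea described above, not Cassandra's actual code. The 0.5x to 1.5x similarity window and the minimum of four sstables per bucket are assumptions modeled on the usual size-tiered defaults, and the function names are hypothetical:

```python
def bucket_by_size(sizes_mb, bucket_low=0.5, bucket_high=1.5):
    """Group sstable sizes into buckets of roughly similar size."""
    buckets = []  # each bucket is a list of sstable sizes in MB
    for size in sorted(sizes_mb):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            # An sstable joins a bucket only if it is close to the bucket average.
            if bucket_low * avg <= size <= bucket_high * avg:
                bucket.append(size)
                break
        else:
            buckets.append([size])
    return buckets


def compaction_candidates(sizes_mb, min_threshold=4):
    """Return the buckets that hold enough sstables to be compacted together."""
    return [b for b in bucket_by_size(sizes_mb) if len(b) >= min_threshold]


# One 10 GB sstable left by a major compaction, plus four freshly flushed ones.
sstables = [10240, 50, 55, 60, 48]
print(compaction_candidates(sstables))  # prints [[48, 50, 55, 60]]
```

The four small sstables form a bucket and get compacted, while the 10 GB sstable sits alone in its own bucket until enough sstables of similar size exist.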

But anyway, this doesn't apply here because we are speaking of LCS (leveled compaction strategy), which works differently from the traditional STCS (size-tiered compaction strategy).

Not sure about it, but you may run upgradesstables or compact to rebuild your sstables after switching from STCS to LCS. I mean both methods trigger an initialization of LCS on old sstables.

Alain


2013/2/25 Hiller, Dean <Dean.Hiller@nrel.gov>
I am confused.  I thought running compact turns off the minor compactions and users are actually supposed to run upgradesstables????  (maybe I am on old documentation?)

Can someone verify that?

Thanks,
Dean

From: Michael Theroux <mtheroux2@yahoo.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Sunday, February 24, 2013 7:45 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Size Tiered -> Leveled Compaction

Aaron,

Thanks for the response.  I think I speak for many Cassandra users when I say we greatly appreciate your help with our questions and issues.  For the specific bug I mentioned, I found this comment:

http://data.story.lu/2012/10/15/cassandra-1-1-6-has-been-released

"Automatic fixing of overlapping leveled sstables (CASSANDRA-4644)"

Although I had difficulty putting 2 and 2 together from the comments in 4644 (it mentioned being fixed in 1.1.6, but also being not reproducible).

We converted two column families yesterday (two we believe would be particularly well suited for Leveled Compaction).  We have two more to convert, but those will wait until next weekend.  So far no issues, and we've seen some positive results.

To help answer some of the questions I posed in this thread, and since others have expressed interest in knowing, the steps we followed were:

1) Perform the proper alter table command:

ALTER TABLE X WITH compaction_strategy_class='LeveledCompactionStrategy' AND compaction_strategy_options:sstable_size_in_mb=10;

2) Run compact on all nodes:

nodetool compact <keyspace> X
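
The second step could be scripted along these lines. This is only a sketch: the host names, keyspace name, and helper functions are hypothetical, and visiting nodes one after another is an illustrative choice rather than something the steps above prescribe. It defaults to a dry run that just prints the `nodetool` commands:

```python
import subprocess


def compact_commands(hosts, keyspace, column_family):
    """Build one 'nodetool compact' command per node."""
    return [
        ["nodetool", "-h", host, "compact", keyspace, column_family]
        for host in hosts
    ]


def run_compactions(hosts, keyspace, column_family, dry_run=True):
    """Run (or, in a dry run, only print) the compaction on each node in turn."""
    for cmd in compact_commands(hosts, keyspace, column_family):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the loop if nodetool reports a failure.
            subprocess.run(cmd, check=True)


# Hypothetical node names and keyspace; "X" is the column family altered in step 1.
run_compactions(["node1", "node2", "node3"], "my_keyspace", "X")
```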

We converted one column family at a time, and temporarily disabled some maintenance activities we perform to decrease load while we converted column families, as the compaction was resource heavy and I wanted to interfere with our operational activities as little as possible.  In our case, the compaction after altering the schema took about an hour and a half.

Thus far, it appears everything worked without a hitch.  I chose 10 MB for the SSTable size, based on Wei's feedback (whose data size is on par with ours) and other tidbits I found through searching.  Based on issues people have reported in the relatively distant past, I made sure that we've been handling the compaction load properly, and I've run test repairs on the specific tables we converted.  We also tested restarting a node after the conversion.

Again, I believe the tables we converted were particularly well suited for Leveled Compaction.  In these particular column families, reads outstripped writes by an order of magnitude or two.

So far, our results have been very positive.  We've seen a greater than 50% reduction in read I/O, and a large improvement in performance for some activities.  We've also seen an improvement in memory utilization.  I imagine others' mileage may vary.

If everything is stable over the next week, we will convert the last two tables we are considering for Leveled Compaction.

Thanks again!
-Mike

On Feb 24, 2013, at 8:56 PM, aaron morton wrote:

If you did not use LCS until after the upgrade to 1.1.9 I think you are ok.

If in doubt the steps here look like they helped https://issues.apache.org/jira/browse/CASSANDRA-4644?focusedCommentId=13456137&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13456137

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 23/02/2013, at 6:56 AM, Mike <mtheroux2@yahoo.com> wrote:

Hello,

Still doing research before we potentially move one of our column families from Size Tiered->Leveled compaction this weekend.  I was doing some research around some of the bugs that were filed against leveled compaction in Cassandra and I found this:

https://issues.apache.org/jira/browse/CASSANDRA-4644

The bug mentions:

"You need to run the offline scrub (bin/sstablescrub) to fix the sstable overlapping problem from early 1.1 releases. (Running with -m to just check for overlaps between sstables should be fine, since you already scrubbed online which will catch out-of-order within an sstable.)"

We recently upgraded from 1.1.2 to 1.1.9.

Does anyone know if an offline scrub is recommended when switching from STCS->LCS after upgrading from 1.1.2?

Any insight would be appreciated,
Thanks,
-Mike

On 2/17/2013 8:57 PM, Wei Zhu wrote:
We doubled the SSTable size to 10M. It still generates a lot of SSTables and we don't see much difference in read latency.  We are able to finish the compactions after repair within several hours. We will increase the SSTable size again if we feel the number of SSTables hurts performance.

----- Original Message -----
From: "Mike" <mtheroux2@yahoo.com>
To: user@cassandra.apache.org
Sent: Sunday, February 17, 2013 4:50:40 AM
Subject: Re: Size Tiered -> Leveled Compaction


Hello Wei,

First, thanks for this response.

Out of curiosity, what SSTable size did you choose for your use case, and what made you decide on that number?

Thanks,
-Mike

On 2/14/2013 3:51 PM, Wei Zhu wrote:


I haven't tried to switch compaction strategy. We started with LCS.


For us, after massive data imports (5000 writes/second for 6 days), the first repair is painful since there is quite a lot of data inconsistency. For 150G nodes, repair brought in about 30G and created thousands of pending compactions. It took almost a day to clear those. Just be prepared: LCS is really slow in 1.1.X. System performance degrades during that time since reads can hit more SSTables; we see 20 SSTable lookups for one read. (We tried everything we could and couldn't speed it up. I think it's single threaded, and it's not recommended to turn on multithreaded compaction. We even tried that, and it didn't help.) There is parallel LCS in 1.2, which is supposed to alleviate the pain. Haven't upgraded yet, hope it works :)


http://www.datastax.com/dev/blog/performance-improvements-in-cassandra-1-2


Since our cluster is not write intensive (only 100 writes/second), I don't see any pending compactions during regular operation.


One thing worth mentioning is the size of the SSTable: the default is 5M, which is kind of small for a 200G (all in one CF) data set, and we are on SSD. It creates more than 150K files in one directory (200G/5M = 40K SSTables, and each SSTable creates 4 files on disk). You might want to watch that when deciding on the SSTable size.
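
The arithmetic behind that file count can be sketched as follows. The helper name is hypothetical, and it assumes 1G = 1024M and the four files per SSTable mentioned above:

```python
def estimate_sstable_files(data_gb, sstable_size_mb, files_per_sstable=4):
    """Estimate the SSTable count and on-disk file count for a leveled CF."""
    sstables = (data_gb * 1024) // sstable_size_mb
    return sstables, sstables * files_per_sstable


for size_mb in (5, 10):
    count, files = estimate_sstable_files(200, size_mb)
    print(f"{size_mb} MB SSTables: ~{count} SSTables, ~{files} files")
# 5 MB SSTables: ~40960 SSTables, ~163840 files
# 10 MB SSTables: ~20480 SSTables, ~81920 files
```

Doubling the SSTable size halves the number of files in the directory.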


By the way, there is no concept of major compaction for LCS. Just for fun, you can look at a file called $CFName.json in your data directory and it tells you the SSTable distribution among different levels.
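
A small sketch of how that distribution could be summarized. The JSON layout used here (a "generations" list whose entries carry a "generation" number and a "members" list) is an assumption about the 1.1-era manifest, and the sample data is made up, so check the keys against your own $CFName.json before relying on it:

```python
import json


def sstables_per_level(manifest_text):
    """Count the SSTables recorded for each level in a leveled manifest."""
    manifest = json.loads(manifest_text)
    # Assumed layout: {"generations": [{"generation": N, "members": [...]}, ...]}
    return {
        entry["generation"]: len(entry["members"])
        for entry in manifest["generations"]
    }


# Made-up sample in the assumed layout; real files list SSTable generation numbers.
sample = """
{"generations": [
  {"generation": 0, "members": [101, 102]},
  {"generation": 1, "members": [90, 91, 92]},
  {"generation": 2, "members": []}
]}
"""
print(sstables_per_level(sample))  # prints {0: 2, 1: 3, 2: 0}
```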


-Wei


From: Charles Brophy <cbrophy@zulily.com>
To: user@cassandra.apache.org
Sent: Thursday, February 14, 2013 8:29 AM
Subject: Re: Size Tiered -> Leveled Compaction


I second these questions: we've been looking into changing some of our CFs to use leveled compaction as well. If anybody here has the wisdom to answer them, it would be a wonderful help.


Thanks
Charles


On Wed, Feb 13, 2013 at 7:50 AM, Mike <mtheroux2@yahoo.com> wrote:


Hello,

I'm investigating the transition of some of our column families from Size Tiered -> Leveled Compaction. I believe we have some high-read-load column families that would benefit tremendously.

I've stood up a test DB node to investigate the transition. I successfully altered the column family, and I immediately noticed a large number (1000+) of pending compaction tasks become available, but no compactions get executed.

I tried running "nodetool upgradesstables" on the column family, and the compaction tasks don't move.

I also noticed no changes to the size and distribution of the existing SSTables.

I then ran a major compaction on the column family. All pending compaction tasks got run, and the SSTables have the distribution that I would expect from LeveledCompaction (lots and lots of 10MB files).

Couple of questions:

1) Is a major compaction required to transition from size-tiered to leveled compaction?
2) Are major compactions as much of a concern for LeveledCompaction as they are for Size Tiered?

All the documentation I found concerning transitioning from Size Tiered to Leveled compaction discusses the ALTER TABLE CQL command, but I haven't found much on what else needs to be done after the schema change.

I did these tests with Cassandra 1.1.9.

Thanks,
-Mike