From: Michael Theroux
Subject: Re: About the heap
Date: Thu, 14 Mar 2013 11:44:44 -0400
To: user@cassandra.apache.org

Hi Aaron,

If you have the chance, could you expand on m1.xlarge being the much better choice?  We will soon need to choose between expanding from a 12 node to a 24 node cluster on m1.large instances and upgrading all instances to m1.xlarge, so the justifications would be helpful (although "Aaron says so" does help ;) ).

One obvious reason is that administering a 24 node cluster adds person-time overhead.

Another reason is the reduced impact of maintenance activities such as repair, since these activities carry significant CPU overhead.  Doubling the cluster size would, in theory, halve the time spent on them, but performance would still be affected during that time.  Going to m1.xlarge would lessen the impact of these activities on operations.

Anything = else?

Thanks,

-Mike

On Mar 14, 2013, at 9:27 AM, aaron morton = wrote:

>> Because of this I have an unstable cluster and have no other choice but to use Amazon EC2 xlarge instances when we would rather use twice as many EC2 large nodes.
> m1.xlarge is a MUCH better choice than m1.large.
> You get more RAM, better IO, and less steal. Using half as many m1.xlarge is the way to go.

>> My heap is actually changing from 3-4 GB to 6 GB and sometimes growing to the max 8 GB (crashing the node).
> How is it crashing?
> Are you getting too much GC or running OOM?
> Are you using the default GC configuration?
> Is Cassandra logging a lot of GC warnings?
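
> (A quick way to check for those warnings, assuming the default log location; adjust the path for your install:)
>
>     grep GCInspector /var/log/cassandra/system.log | tail -20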

> If you are running OOM then something has to change. Maybe bloom filters, maybe caches.
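
> (As a sketch only, with placeholder names: on 1.1 the per-CF bloom filter false positive chance can be raised from cassandra-cli, then "nodetool upgradesstables MyKeyspace MyCF" run on each node so the existing filters are rebuilt smaller:)
>
>     update column family MyCF with bloom_filter_fp_chance = 0.1;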

> Enable the GC logging in cassandra-env.sh to check how low a CMS compaction gets the heap, or use some other tool. That will give an idea of how much memory you are using.
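
> (From memory, cassandra-env.sh ships with the relevant lines commented out; uncommenting something like the following and restarting the node is enough. The log path is only an example:)
>
>     JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
>     JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps"
>     JVM_OPTS="$JVM_OPTS -XX:+PrintHeapAtGC"
>     JVM_OPTS="$JVM_OPTS -XX:+PrintTenuringDistribution"
>     JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationStoppedTime"
>     JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"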

> Here is some background on what is kept on heap pre-1.2:
> http://www.mail-archive.com/user@cassandra.apache.org/msg25762.html

> Cheers

> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
> @aaronmorton
> http://www.thelastpickle.com

> On 13/03/2013, at 12:19 PM, Wei Zhu <wz1975@yahoo.com> wrote:

>> Here is the JIRA I submitted regarding the ancestors.

>> https://issues.apache.org/jira/browse/CASSANDRA-5342

>> -Wei


>> ----- Original Message -----
>> From: "Wei Zhu" <wz1975@yahoo.com>
>> To: user@cassandra.apache.org
>> Sent: Wednesday, March 13, 2013 11:35:29 AM
>> Subject: Re: About the heap

>> Hi Dean,
>> index_interval controls the sampling of the SSTable index, which speeds up the lookup of keys in the SSTable. Here is the code:

>> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/DataTracker.java#L478

>> Increasing the interval means taking fewer samples, which uses less memory but makes key lookups for reads slower.
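
>> (For reference, the setting lives in cassandra.yaml; 128 is the default, and the value below is only an illustration:)
>>
>>     # larger interval = fewer index samples held on heap,
>>     # at the cost of slightly slower key lookups
>>     index_interval: 512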

>> I did do a heap dump on my production system, which caused about a 10 second pause of the node. I found something interesting: for LCS, one compaction can involve thousands of SSTables, and the ancestors are recorded in case something goes wrong during the compaction. But those are never removed after the compaction is done. In our case it takes about 1 GB of heap memory to store them. I am going to submit a JIRA for that.
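
>> (For anyone wanting to inspect their own heap in MAT, a dump can be taken with jmap; the "live" option triggers a full GC first, which is likely where a pause like that comes from. The path and pid are placeholders:)
>>
>>     jmap -dump:live,format=b,file=/tmp/cassandra-heap.hprof <cassandra-pid>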

>> Here is the culprit:

>> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/SSTableMetadata.java#L58

>> Enjoy looking at the Cassandra code :)

>> -Wei


>> ----- Original Message -----
>> From: "Dean Hiller" <Dean.Hiller@nrel.gov>
>> To: user@cassandra.apache.org
>> Sent: Wednesday, March 13, 2013 11:11:14 AM
>> Subject: Re: About the heap

>> Going to 1.2.2 helped us quite a bit, as did switching from STCS to LCS, which gave us smaller bloom filters.
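
>> (On 1.2 the switch can be made per table with CQL3, roughly as below; the keyspace/table names are placeholders, and existing data is re-levelled by compaction afterwards:)
>>
>>     ALTER TABLE my_keyspace.my_table
>>       WITH compaction = {'class': 'LeveledCompactionStrategy'};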

>> As far as key cache: there is an entry in cassandra.yaml called index_interval, set to 128. I am not sure if that is related to key_cache; I think it is. By turning that up to 512 or maybe even 1024 you will consume less RAM there as well, though I ran this test in QA and my key cache size stayed the same, so I am really not sure. (I am actually checking out the Cassandra code now to dig a little deeper into this property.)

>> Dean

>> From: Alain RODRIGUEZ <arodrime@gmail.com>
>> Reply-To: user@cassandra.apache.org
>> Date: Wednesday, March 13, 2013 10:11 AM
>> To: user@cassandra.apache.org
>> Subject: About the heap

>> Hi,

>> I would like to know everything that is in the heap.

>> We are here speaking of C* 1.1.6.

>> Theory:

>> - Memtable (1024 MB)
>> - Key Cache (100 MB)
>> - Row Cache (disabled, and serialized with JNA activated anyway, so it should be off-heap)
>> - Bloom filters (about 1.03 GB - from cfstats, adding up all the "Bloom Filter Space Used" values, which are shown in bytes - 1103765112 in total; see the one-liner after this list)
>> - Anything else?
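
>> (Something like this adds them up; the value is the last field of each matching cfstats line, but check the output format of your version and adjust if needed:)
>>
>>     nodetool cfstats | grep "Bloom Filter Space Used" | awk '{sum += $NF} END {print sum " bytes"}'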

>> So my heap should be fluctuating between 1.15 GB and 2.15 GB and growing slowly (from the bloom filters of my new data).

>> My heap is actually changing from 3-4 GB to 6 GB and sometimes growing to the max 8 GB (crashing the node).

>> Because of this I have an unstable cluster and have no other choice but to use Amazon EC2 xlarge instances when we would rather use twice as many EC2 large nodes.

>> What am I missing?

>> Practice:

>> Is there a way, easy to do and not inducing any load, to dump the heap and analyse it with MAT (or anything else you could advise)?

>> Alain


