Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (nike.apache.org: domain of daniel.y.woo@gmail.com
 designates 209.85.217.172 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <CACHzRHZNK5roh4UyZrN_LZ00YxfYQARXTGFR_=rpyb7YC7fq3g@mail.gmail.com>
References: 
 <CAF-kt2sLhyKiiko2DKdLUDxipv-Dmez1p_kue+ZFFmLKb1Nr3w@mail.gmail.com>
	<CACHzRHZNK5roh4UyZrN_LZ00YxfYQARXTGFR_=rpyb7YC7fq3g@mail.gmail.com>
Date: Fri, 12 Oct 2012 16:26:37 +0800
Message-ID: 
 <CAF-kt2sRP-kqqnkPAnoO+TWk7yJDAcbjD7Uieep0C2idCmW4xQ@mail.gmail.com>
Subject: Re: cassandra 1.0.8 memory usage
From: Daniel Woo <daniel.y.woo@gmail.com>
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=f46d0401fe5940592c04cbd872b9

--f46d0401fe5940592c04cbd872b9
Content-Type: text/plain; charset=UTF-8

Hi Rob,

>>What version of Cassandra? What JVM? Are JNA and Jamm working?
cassandra 1.0.8. Sun JDK 1.7.0_05-b06, JNA memlock enabled, jamm works.

>>It sounds like the two nodes that are pathological right now have
exhausted the perm gen with actual non-garbage, probably mostly the Bloom
filters and the JMX MBeans.
jmap shows that the perm gen is only 40% used.

>>Do you have a "large" number of ColumnFamilies? How large is the data
stored per node?
I have very few column families, maybe 30-50. The nodetool shows each node
has 5 GB load.

>> Disable swap for cassandra node
I am going to set swappiness to 20.

Thanks,
Daniel


On Fri, Oct 12, 2012 at 2:02 AM, Rob Coli <rcoli@palominodb.com> wrote:

> On Wed, Oct 10, 2012 at 11:04 PM, Daniel Woo <daniel.y.woo@gmail.com>
> wrote:
> > I am running a mini cluster with 6 nodes, recently we see very frequent
> > ParNewGC on two nodes. It takes 200 - 800 ms on average, sometimes it
> > takes 5 seconds. You know, the ParNewGC is a stop-the-world GC and our
> > client throws SocketTimeoutException every 3 minutes.
>
> What version of Cassandra? What JVM? Are JNA and Jamm working?
>
> > I checked the load, it seems well balanced, and the two nodes are
> > running on the same hardware: 2 * 4 cores xeon with 16G RAM, we give
> > cassandra 4G heap, including 800MB young generation. We did not see any
> > swap usage during the GC, any idea about this?
>
> It sounds like the two nodes that are pathological right now have
> exhausted the perm gen with actual non-garbage, probably mostly the
> Bloom filters and the JMX MBeans.
>
> > Then I took a heap dump, it shows that 5 instances of JmxMBeanServer
> > hold 500MB memory and most of the referenced objects are JMX mbean
> > related, it's kind of weird to me and looks like a memory leak.
>
> Do you have a "large" number of ColumnFamilies? How large is the data
> stored per node?
>
> =Rob
>
> --
> =Robert Coli
> AIM&GTALK - rcoli@palominodb.com
> YAHOO - rcoli.palominob
> SKYPE - rcoli_palominodb
>


-- 
Thanks & Regards,
Daniel

--f46d0401fe5940592c04cbd872b9--