Mailing-List: contact cassandra-user-help@incubator.apache.org; run by ezmlm
Precedence: bulk
Reply-To: cassandra-user@incubator.apache.org
MIME-Version: 1.0
In-Reply-To: <5f7770581002161028s21c54cdfk846e2c973d06aaf6@mail.gmail.com>
References: <a4f278e51002160625i40f58ac6v9d2ad532bf108fae@mail.gmail.com>
	 <5f7770581002161028s21c54cdfk846e2c973d06aaf6@mail.gmail.com>
Date: Sat, 20 Feb 2010 11:40:46 +0800
Message-ID: <f91bc4fd1002191940w4c1e685dt16b6e16afd8e7cb7@mail.gmail.com>
Subject: Re: cassandra freezes
From: Santal Li <santal.li@gmail.com>
To: cassandra-user@incubator.apache.org
Content-Type: multipart/alternative; boundary=0016364eea362b2ea1047ffff715

--0016364eea362b2ea1047ffff715
Content-Type: text/plain; charset=ISO-8859-1

I am seeing almost the same thing as you. When I run write benchmarks, one
Cassandra node sometimes freezes; the other nodes consider it down, and it
comes back up after 30+ seconds. I am using 5 nodes, each with 8G of memory
for the Java heap.

From my investigation, it is caused by the GC thread. I started JConsole
and monitored the heap usage: each time a GC happened, heap usage dropped
from 6G to 1G, and when I checked the Cassandra log, I found the freezes
happened at exactly the same times.

So I think that with a large heap (>2G) we may need a different GC
strategy from the default one provided by the Cassandra launch script.
Has anyone else met this situation? Can you please provide some guidance?
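
To check which collectors a JVM is actually running and how much time they
have taken, something like the following might help. It is only a sketch:
the class name GcWatch is made up for illustration, and it reads the MXBeans
of the JVM it runs in, so watching a Cassandra node would mean reading the
same beans over a remote JMX connection, as JConsole does (not shown here).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of Cassandra.
public class GcWatch {
    // One line per collector: sample timestamp, collector name,
    // number of collections so far, and accumulated collection time in ms.
    static String snapshot() {
        StringBuilder sb = new StringBuilder();
        long now = System.currentTimeMillis();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(now)
              .append(' ').append(gc.getName())
              .append(" count=").append(gc.getCollectionCount())
              .append(" timeMs=").append(gc.getCollectionTime())
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        // Take a few samples one second apart.
        for (int i = 0; i < 3; i++) {
            System.out.print(snapshot());
            Thread.sleep(1000);
        }
    }
}
```

A jump in timeMs between two samples that lines up with a freeze in the
Cassandra log would point at GC as the cause.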


Thanks
-Santal

2010/2/17 Tatu Saloranta <tsaloranta@gmail.com>

> On Tue, Feb 16, 2010 at 6:25 AM, Boris Shulman <shulmanb@gmail.com> wrote:
> > Hello, I'm running some benchmarks on 2 Cassandra nodes, each running
> > on an 8-core machine with 16G RAM, 10G for Java heap. I've noticed that
> > during benchmarks with numerous writes Cassandra just freezes for
> > several minutes (in those benchmarks I'm writing batches of 10 columns
> > with 1K data each for every key in a single CF). Usually after
> > performing 50K writes I'm getting a TimeOutException and Cassandra
> > just freezes. What configuration changes can I make in order to
> > prevent this? Is it possible that my setup just can't handle the load?
> > How can I calculate the number of Cassandra nodes for a desired load?
>
> One thing that can cause seeming lockups is the garbage collector. So
> enabling GC debug output would be helpful, to see GC activity. Some
> collectors (CMS specifically) can stop the system for a very long time,
> up to minutes. This is not necessarily the root cause, but is easy to
> rule out.
> Beyond this, getting a stack trace during lockup would make sense.
> That can pinpoint what threads are doing, or what they are blocked on
> in case there is a deadlock or heavy contention on some shared
> resource.
>
> -+ Tatu +-
>
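
For the stack trace suggestion quoted above, a rough sketch of taking a
thread dump from inside a JVM with the standard java.lang.management API.
The class name ThreadDump is made up for illustration, and this is not
something Cassandra ships; it only shows what kind of information a dump
gives (thread state, the lock a thread is waiting on, any deadlocked threads).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of Cassandra.
public class ThreadDump {
    static String dump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        boolean monitors = threads.isObjectMonitorUsageSupported();
        boolean synchronizers = threads.isSynchronizerUsageSupported();

        // Both find* methods return null when no deadlock is detected.
        long[] deadlocked = synchronizers
                ? threads.findDeadlockedThreads()
                : threads.findMonitorDeadlockedThreads();

        StringBuilder sb = new StringBuilder();
        sb.append("deadlocked threads: ")
          .append(deadlocked == null ? 0 : deadlocked.length)
          .append('\n');

        // One entry per live thread: name, state, lock it waits on, stack frames.
        for (ThreadInfo info : threads.dumpAllThreads(monitors, synchronizers)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState());
            if (info.getLockName() != null) {
                sb.append(" waiting on ").append(info.getLockName());
            }
            sb.append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

On a node that is already frozen, the JDK's jstack tool run against the
process id gives the same kind of dump from outside the process.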

--0016364eea362b2ea1047ffff715--