Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (nike.apache.org: domain of feedlydev@gmail.com designates
 209.85.160.172 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <CAGNi5tn_AMKBWiEjESjPh_Cv9aGsj9LvnwG6vYK1ts7Aj_Ab=w@mail.gmail.com>
References: 
 <CAGNi5tmLWgMjZmEY_WbsPb0DUXH3zWCVfNCL9AjHOPEAuc=ukg@mail.gmail.com>
	<20120702121753.2A63.C3984673@terra.com.br>
	<CAGNi5tn_AMKBWiEjESjPh_Cv9aGsj9LvnwG6vYK1ts7Aj_Ab=w@mail.gmail.com>
Date: Mon, 2 Jul 2012 23:41:25 -0400
Message-ID: 
 <CAGNi5tnS+5qpyWJ7E-aMcBh-DbJ=vazd3vo-Gv+gkXUtom6Jyw@mail.gmail.com>
Subject: Re: frequent node up/downs
From: feedly team <feedlydev@gmail.com>
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=20cf301cc4ee67f5f104c3e4b069

--20cf301cc4ee67f5f104c3e4b069
Content-Type: text/plain; charset=ISO-8859-1

A couple more details. I confirmed that swap space is not being used (free -m
shows 0 swap) and cassandra.log has a message like "JNA mlockall
successful". top shows the process with 9g of resident memory but 21.6g
of virtual memory. What accounts for the much larger virtual number? Some
kind of off-heap memory?
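
One way to compare these numbers outside of top is to read the kernel's
per-process counters. The following is a minimal sketch, assuming Linux and
a readable /proc/<pid>/status; the function names are hypothetical and not
part of Cassandra. VmSize counts all mapped address space, including
memory-mapped files, whether or not the pages are resident, so it is
normally larger than VmRSS.

```python
import sys


def parse_status_kb(text):
    """Return the 'kB' fields of a /proc/<pid>/status dump as {name: kB}."""
    fields = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        # Memory fields look like "VmRSS:   9437184 kB"; skip everything else.
        if len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields


def memory_summary(pid):
    """Read virtual, resident, swapped and locked sizes (kB) for one process."""
    with open(f"/proc/{pid}/status") as f:
        fields = parse_status_kb(f.read())
    # A field is None if this kernel does not report it.
    return {name: fields.get(name)
            for name in ("VmSize", "VmRSS", "VmSwap", "VmLck")}


if __name__ == "__main__":
    # Usage (script name is hypothetical): python memsummary.py <cassandra-pid>
    for name, kb in memory_summary(int(sys.argv[1])).items():
        print(name, kb, "kB")
```

A VmSwap of 0 would agree with what free -m reports, and a non-zero VmLck
would agree with the mlockall message.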

I'm a little puzzled as to why I would get such long pauses without
swapping. I uncommented all the gc logging options in cassandra-env.sh to
try to see what is going on when the node freezes.
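
Once the GC log is being written, the long pauses can be pulled out with a
small script. This is a rough sketch, assuming the log contains the
"[Times: user=... sys=..., real=N secs]" suffix that HotSpot prints with
-XX:+PrintGCDetails; the exact format varies by JVM version, and the
function name and threshold are hypothetical.

```python
import re
import sys

# Wall-clock duration reported in the "[Times: ...]" suffix of a GC record.
REAL_RE = re.compile(r"real=(\d+(?:\.\d+)?) secs")


def long_pauses(lines, threshold_secs=1.0):
    """Return (seconds, line) for every record at or above the threshold."""
    hits = []
    for line in lines:
        match = REAL_RE.search(line)
        if match is None:
            continue
        seconds = float(match.group(1))
        if seconds >= threshold_secs:
            hits.append((seconds, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Usage (script name is hypothetical): python gcpauses.py <path-to-gc-log>
    with open(sys.argv[1]) as log:
        for seconds, line in long_pauses(log):
            print(f"{seconds:8.2f}s  {line}")
```

The filter matches any collector rather than only ParNew, because some
logging options can split one collection record across several lines.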

Thanks
Kireet

On Mon, Jul 2, 2012 at 9:51 PM, feedly team <feedlydev@gmail.com> wrote:

> Yeah, I noticed the leap second problem and ran the suggested fix, but I
> had been facing these problems before Saturday and still see the
> occasional failures after running the fix.
>
> Thanks.
>
>
> On Mon, Jul 2, 2012 at 11:17 AM, Marcus Both <mboth@terra.com.br> wrote:
>
>> Yeah! Look at that.
>>
>> http://arstechnica.com/business/2012/07/one-day-later-the-leap-second-v-the-internet-scorecard/
>> I had the same problem. The solution was rebooting.
>>
>> On Mon, 2 Jul 2012 11:08:57 -0400
>> feedly team <feedlydev@gmail.com> wrote:
>>
>> > Hello,
>> >    I recently set up a 2 node cassandra cluster on dedicated hardware. In
>> > the logs there have been a lot of "InetAddress xxx is now dead" or UP
>> > messages. Comparing the log messages between the 2 nodes, they seem to
>> > coincide with extremely long ParNew collections. I have seen some of up to
>> > 50 seconds. The installation is pretty vanilla, I didn't change any
>> > settings and the machines don't seem particularly busy - cassandra is the
>> > only thing running on the machine with an 8GB heap. The machine has 64GB of
>> > RAM and CPU/IO usage looks pretty light. I do see a lot of 'Heap is xxx
>> > full. You may need to reduce memtable and/or cache sizes' messages. Would
>> > this help with the long ParNew collections? That message seems to be
>> > triggered on a full collection.
>>
>> --
>> Marcus Both
>>
>>
>

--20cf301cc4ee67f5f104c3e4b069--