Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
MIME-Version: 1.0
In-Reply-To: <AANLkTi=f9eJmbuMgoTb8s87fFur_vzKVrAjaDTXSU7iv@mail.gmail.com>
References: <AANLkTi=BcxNdpjALJufGuJN0y9Ykwi-0Kbabotf_=hgr@mail.gmail.com>
	<CB9DCAEA-8DAF-48BE-B098-2A6E8169ACBE@thelastpickle.com>
	<AANLkTinZifrFYK=Fp8WNArAo7WaND5GY1cY6qwauUC-8@mail.gmail.com>
	<AANLkTikWBBTQpRcbQcCSM2GL+ebphnGeqtCFnfpMU4k8@mail.gmail.com>
	<4D7553DD.1080702@yellowseo.com>
	<AANLkTi=QvGmb09XaJeF4pUt3u+zK=bF6GUZLu8Jc_zVz@mail.gmail.com>
	<AANLkTimu2wgK-ienzYUZ0fgCL=9myZATijVpqsTSSsWd@mail.gmail.com>
	<AANLkTi=f9eJmbuMgoTb8s87fFur_vzKVrAjaDTXSU7iv@mail.gmail.com>
Date: Tue, 8 Mar 2011 23:14:15 +0300
Message-ID: <AANLkTik=pDbWN_Owai69bR-Sk-7c5hY=h8UJ-TDS3MYi@mail.gmail.com>
Subject: Re: Nodes frozen in GC
From: ruslan usifov <ruslan.usifov@gmail.com>
To: Peter Schuller <peter.schuller@infidyne.com>, user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=00504502cc39a8b51b049dfe41eb

--00504502cc39a8b51b049dfe41eb
Content-Type: text/plain; charset=ISO-8859-1

2011/3/8 Peter Schuller <peter.schuller@infidyne.com>

> > $client->batch_mutate($mutations, cassandra_ConsistencyLevel::QUORUM);
>
> Btw, what are the mutations? Are you doing something like inserting
> both very small values and very large ones?
>

I have a big XML file (5 GB, a MySQL dump in XML format) and read the data
from it with a SAX XML parser. All records in that file look like this:

        <row>
                <field name="uid">5</field>
                <field name="aid">3619780:1</field>
                <field name="cleanness">0</field>
                <field name="counter">7</field>
                <field name="gcount">0</field>
                <field name="lastchange">1291053619</field>
                <field name="disaster">0</field>
                <field name="tdisaster">0</field>
        </row>


In this case each batch of mutations holds 10 similar records. The following
code fragment shows how the batch is built and sent:

    // Wrap $l_columns in a super column named after the row's "aid" field.
    $l_supercolumn = new cassandra_SuperColumn(array("name" => $l_row["aid"], "columns" => $l_columns));
    $l_c_or_sc = new cassandra_ColumnOrSuperColumn(array("super_column" => $l_supercolumn));
    $l_mutation = new cassandra_Mutation(array("column_or_supercolumn" => $l_c_or_sc));

    // Group mutations by row key, then by column family.
    if (array_key_exists($l_key, $mutations))
    {
        array_push($mutations[$l_key]['aquarium_friend'], $l_mutation);
    }
    else
    {
        $mutations[$l_key] = array('aquarium_friend' => array($l_mutation));
    }

    // Send the accumulated batch every 10 records.
    if (!($l_i % 10))
    {
        make_mutation($client, $mutations, $g_loger, $g_rloger);
        $mutations = array();

        // Log progress every 1000 records.
        if (!($l_i % 1000))
        {
            $g_loger->info(sprintf("inserted: %s", $l_i));
        }
    }
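
For readers who want to reproduce the batching pattern without the Thrift
client, here is a minimal self-contained sketch. It is not the loader used
above: the class name MutationBatcher, the callback that stands in for
make_mutation(), and the row key 'uid-5' are all hypothetical. Unlike a plain
modulo check on the record counter, it also sends the final partial batch
when flush() is called after the last record.

```php
<?php
// Hypothetical helper (not part of any Cassandra client library): groups
// mutations by row key and column family, and hands the accumulated batch
// to a callback every $batchSize records.
class MutationBatcher
{
    private $batchSize;
    private $sender;              // callable standing in for make_mutation()
    private $mutations = array();
    private $pending = 0;

    public function __construct($batchSize, $sender)
    {
        $this->batchSize = $batchSize;
        $this->sender = $sender;
    }

    public function add($rowKey, $columnFamily, $mutation)
    {
        // PHP creates the nested arrays on first use.
        $this->mutations[$rowKey][$columnFamily][] = $mutation;
        $this->pending++;
        if ($this->pending >= $this->batchSize) {
            $this->flush();
        }
    }

    // Sends whatever is pending; call once more after the last record.
    public function flush()
    {
        if ($this->pending === 0) {
            return;
        }
        call_user_func($this->sender, $this->mutations);
        $this->mutations = array();
        $this->pending = 0;
    }
}

// Usage: 25 placeholder records in batches of 10 give batches of 10, 10 and 5.
$sent = array();
$batcher = new MutationBatcher(10, function ($batch) use (&$sent) {
    $sent[] = $batch;             // a real loader would call batch_mutate here
});
for ($i = 1; $i <= 25; $i++) {
    $batcher->add('uid-5', 'aquarium_friend', "mutation-$i");
}
$batcher->flush();
echo count($sent), "\n";          // prints 3
```

Counting records inside the helper keeps the batch size independent of the
parser's own record counter.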


> That's why I asked about the frequency. If you're doing a long-term
> stress test and seeing a 30 second pause once per week, that's a lot
> more likely to be "normal" for your workload than if you're seeing it
> happen once every three minutes. The issue is that if you want to fix
> your problem, one must first figure out what the problem *is*. Based
> on past mailing list traffic, it seems most people's problems that are
> seemingly "due to GC" end up being because of a too high live set size
> or the CMS phase triggering too late. These are fixable issues if you
> are running into them.
>
>
In my case this happens from time to time. For example, inserting the whole
5 GB XML file took about 30-40 minutes, and the nodes froze about 5-10 times
during that period (average freeze time 15 seconds).
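
To put those figures next to the "once every three minutes" question, here is
a rough back-of-the-envelope calculation. It uses only the numbers quoted in
this message; the function names are made up for illustration, and the result
is an estimate from approximate observations, not a measurement.

```php
<?php
// Share of the run spent frozen, from freeze count, average freeze length
// (seconds) and total run time (minutes).
function frozen_fraction($freezes, $avgFreezeSecs, $runMinutes)
{
    return ($freezes * $avgFreezeSecs) / ($runMinutes * 60);
}

// Average number of minutes from one freeze to the next.
function minutes_between_freezes($freezes, $runMinutes)
{
    return $runMinutes / $freezes;
}

$best  = frozen_fraction(5, 15, 40);    // fewest freezes, longest run
$worst = frozen_fraction(10, 15, 30);   // most freezes, shortest run

printf("frozen for %.1f%% to %.1f%% of the run\n", $best * 100, $worst * 100);
printf("one freeze every %d to %d minutes\n",
    minutes_between_freezes(10, 30), minutes_between_freezes(5, 40));
// prints:
// frozen for 3.1% to 8.3% of the run
// one freeze every 3 to 8 minutes
```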


> If all you have is a single column family with a 64 mb flush threshold
> and doing a bunch of insertions, and have a heap size of 5 (was it?)
> gig, you should not be having these issues. But stating that helps no
> one, which is why I'm asking for more information. (Widely
> extrapolating and suggesting that all Cassandra nodes will always
> freeze for 30 seconds every now and then is also helping no one, other
> than not being true.)
>

Initially the heap was 6 GB. When I increased the heap size to 7 GB, the nodes
froze only once, but that freeze lasted much longer (40 seconds).

--00504502cc39a8b51b049dfe41eb--