Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (athena.apache.org: domain of tom@drillster.com designates
 209.85.128.53 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <CAJMN9iN0yvUctWuAQqY8O10n1atBn=DPwez0_ax5VhqvtjmzSQ@mail.gmail.com>
References: 
 <CAADG=Lg1Lgh3jCmanEO5anrqqZUWCy=S7YS+qKP-if1Va8eGwA@mail.gmail.com>
	<CAJMN9iM5MS-X5sKT0NWkaHR8tawXd32K4FKnbn1UD_fz_oH5cg@mail.gmail.com>
	<CAADG=Li=irR-iz2+GRCFbYop8PNOKujzA1HfHAEWP1RM7hCOvQ@mail.gmail.com>
	<CAJMN9iOmAuSRhum_CA+NfyASTOxmxzSfw+fBpsFT-75BUmpwSg@mail.gmail.com>
	<CAADG=LhDLH5Tyo_9TTdGTv_PSDUCE16xSDejhuun=SK3DK1DCw@mail.gmail.com>
	<CAJMN9iN0yvUctWuAQqY8O10n1atBn=DPwez0_ax5VhqvtjmzSQ@mail.gmail.com>
Date: Sat, 7 Dec 2013 14:28:59 +0100
Message-ID: 
 <CAADG=LhgLh9fBD-PmLdh1NOeLa3hU4BtQSxK+6aKZZ5U4c7MuA@mail.gmail.com>
Subject: Re: How to monitor the progress of a HintedHandoff task?
From: Tom van den Berge <tom@drillster.com>
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=001a1133d6eed8c21604ecf1be19

--001a1133d6eed8c21604ecf1be19
Content-Type: text/plain; charset=ISO-8859-1

Rahul,

I've made some progress in my investigations in the meantime. It seems
that the network bandwidth to my remote data center is relatively small,
and at the same time my application generates far more write operations
than I was expecting, resulting in more replication data to the remote DC.

In the case of a network hiccup, or a sudden peak in data generated by my
application (or both), it seems that the network capacity to the remote DC
is simply not sufficient to keep up with the data. This results in the
hints piling up.
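
To make that reasoning concrete, here is a rough back-of-the-envelope
sketch. The numbers are made up for illustration (I have not measured
them), the function names are my own, and it assumes that the hint volume
is roughly equal to the write traffic that does not fit through the link,
which ignores any per-hint overhead.

```python
def hint_backlog_mb(write_rate_mbps, link_capacity_mbps, seconds):
    """Data (in MB) that cannot be shipped to the remote DC within `seconds`.

    Rates are in megabits per second. Assumption: whatever exceeds the
    link capacity ends up stored as hints.
    """
    surplus_mbps = max(0.0, write_rate_mbps - link_capacity_mbps)
    return surplus_mbps * seconds / 8.0  # megabits -> megabytes


def drain_time_s(backlog_mb, write_rate_mbps, link_capacity_mbps):
    """Seconds needed to deliver the backlog once the write rate has dropped.

    Returns None when the link has no spare capacity, i.e. the backlog
    can never be drained at this write rate.
    """
    spare_mbps = link_capacity_mbps - write_rate_mbps
    if spare_mbps <= 0:
        return None
    return backlog_mb * 8.0 / spare_mbps


# Hypothetical figures: a 10 Mbit/s link and a 10-minute burst of 12 Mbit/s.
print(hint_backlog_mb(12, 10, 600))   # 150.0 (MB of hints after the burst)
print(drain_time_s(150, 8, 10))       # 600.0 (seconds, with 2 Mbit/s spare)
print(drain_time_s(150, 10, 10))      # None  (no spare capacity at all)
```

The point is only that a small sustained surplus is enough to build a
backlog, and that draining it depends entirely on how much spare capacity
is left afterwards.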

On top of that, my Cassandra nodes are equipped with a moderate amount of
memory (4 GB). This might simply not be enough to maintain the hints and
other column families in memtables. When the problem occurs, I can see that
the node is very busy flushing the hints memtable to disk, which obviously
results in high CPU/IO load.

I've managed to significantly reduce the number of write/delete operations
from my application, which should greatly decrease the rate at which the
hints CF grows in case of timeouts to the remote DC. I'm also planning to
put some more memory in the servers. Can you think of anything else I
might have missed?
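
Since the logs only report the start and the end of a handoff, one thing
that may help in the meantime is scanning system.log for tasks that
started but never reported a result. Below is a minimal sketch of that
idea. The function name is hypothetical, and it assumes that every log
line carries the thread name in brackets and that the messages match the
ones quoted in this thread; only the first sample line is taken from my
real log, the other two are made-up test input.

```python
import re

# Thread name as it appears in the log line, e.g. "[HintedHandoff:1]".
THREAD = re.compile(r"\[(HintedHandoff:\d+)\]")
# "Started" message as seen in my log: host id followed by the IP address.
STARTED = re.compile(r"Started hinted handoff for host: (\S+) with IP: /(\S+)")
# Either outcome ends a task (message texts as quoted earlier in the thread).
DONE = re.compile(r"Finished hinted handoff|Timed out replaying hints")


def unfinished_handoffs(lines):
    """Return {thread name: (host id, ip)} for handoffs with no end message."""
    pending = {}
    for line in lines:
        thread = THREAD.search(line)
        if not thread:
            continue  # not a hinted handoff log line
        name = thread.group(1)
        started = STARTED.search(line)
        if started:
            pending[name] = (started.group(1), started.group(2))
        elif DONE.search(line):
            pending.pop(name, None)
    return pending


sample = [
    # Real line from my log:
    "INFO [HintedHandoff:1] 2013-12-02 13:49:05,325 HintedHandOffManager.java "
    "(line 297) Started hinted handoff for host: "
    "6f80b942-5b6d-4233-9827-3727591abf55 with IP: /10.55.156.66",
    # Made-up lines, for illustration only:
    "INFO [HintedHandoff:2] 2013-12-02 13:50:00,000 HintedHandOffManager.java "
    "(line 297) Started hinted handoff for host: "
    "00000000-0000-0000-0000-000000000000 with IP: /10.0.0.1",
    "INFO [HintedHandoff:2] 2013-12-02 13:50:09,000 HintedHandOffManager.java "
    "(line ...) Finished hinted handoff of 12 rows to endpoint /10.0.0.1",
]

stuck = unfinished_handoffs(sample)
print(stuck)
```

In practice the lines would come from the log file itself, e.g.
unfinished_handoffs(open(path)) with the path to system.log. This still
doesn't show how far a running task has progressed, but it does tell me
which endpoint a stuck task is targeting.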

Thanks for your feedback -- it's highly appreciated!

Tom


On Fri, Dec 6, 2013 at 4:41 PM, Rahul Menon <rahul@apigee.com> wrote:

> Tom,
>
> You should look at phi_convict_threshold and try increasing the value if
> you have too much chatter on your network.
>
> Also, rebuilding the entire node because of an OOM does not make sense.
> Could you please post the C* version that you are using and the heap size
> you have configured?
>
> Thanks
> Rahul
>
>
> On Tue, Dec 3, 2013 at 7:41 PM, Tom van den Berge <tom@drillster.com> wrote:
>
>> Rahul,
>>
>> This problem occurs every now and then, and currently everything is ok,
>> so there are no hints. But whenever it happens, the hints pile up
>> quickly. This results in heap problems on the node ("Heap is 0.813462
>> full..." appears many times). This in turn results in the flushing of the
>> 'hints' column family, to relieve memory pressure. (According to the log
>> message, the size varies between 50 and 60MB.) But since the
>> HintedHandoffManager is reading from the hints CF, it will probably pull it
>> back into a memtable again -- that's at least my understanding of how it
>> works.
>>
>> So I guess that flushing the hints CF while the HintedHandoffManager is
>> working on it only makes things worse, and it could be the reason that the
>> process never ends.
>>
>> What I typically see when this happens is that the hints keep piling up,
>> and eventually the node comes to a grinding halt (OOM). Then I have to
>> rebuild the node entirely (only removing the hints doesn't work).
>>
>> The reason for hints to start accumulating in the first place might be a
>> spike in CF writes that must be replicated to a node in another data
>> center. The available bandwidth to that data center might not be able to
>> handle the data quickly enough, resulting in stored hints. The
>> HintedHandoff task that is started is targeting that remote node.
>>
>>
>> Thanks,
>> Tom
>>
>>
>> On Tue, Dec 3, 2013 at 2:22 PM, Rahul Menon <rahul@apigee.com> wrote:
>>
>>> Tom,
>>>
>>> Do you know why these hints are piling up? What is the size of the hints
>>> cf?
>>>
>>> Thanks
>>> Rahul
>>>
>>>
>>> On Tue, Dec 3, 2013 at 6:41 PM, Tom van den Berge <tom@drillster.com> wrote:
>>>
>>>> Hi Rahul,
>>>>
>>>> Thanks for your reply.
>>>>
>>>> I have never seen a message like "Timed out replaying hints to...", which
>>>> is a good thing then, I suppose ;)
>>>>
>>>> Normally, I do see the "Finished hinted handoff..." log message.
>>>> However, every now and then this message is not logged, not even after
>>>> several hours. This is the problem I'm trying to solve.
>>>>
>>>> The log messages you describe are quite coarse-grained; they only tell
>>>> you that a task has started or finished, but not how this task is
>>>> progressing. And that's exactly what I would like to know if I see that a
>>>> task has started, but has not finished after a reasonable amount of time.
>>>>
>>>> So I guess the only way to learn the progress is to look inside the
>>>> 'hints' column family then. I'll give that a try.
>>>>
>>>>
>>>> Thanks,
>>>> Tom
>>>>
>>>>
>>>> On Tue, Dec 3, 2013 at 1:43 PM, Rahul Menon <rahul@apigee.com> wrote:
>>>>
>>>>> Tom,
>>>>>
>>>>> You should check the size of the hints column family to determine how
>>>>> many are present. The hints are stored in a super column family whose
>>>>> keys are destination tokens. You could look at it if you would like.
>>>>>
>>>>> Hint sends and timeouts are logged; you should be seeing something
>>>>> like
>>>>>
>>>>> Timed out replaying hints to {}; aborting ({} delivered
>>>>>
>>>>> OR
>>>>>
>>>>> Finished hinted handoff of {} rows to endpoint {}
>>>>>
>>>>>
>>>>>
>>>>> Thanks
>>>>> Rahul
>>>>>
>>>>>
>>>>> On Tue, Dec 3, 2013 at 2:36 PM, Tom van den Berge <tom@drillster.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Is there a way to monitor the progress of a hinted handoff task?
>>>>>>
>>>>>> I found the following two mbeans providing some info:
>>>>>>
>>>>>> org.apache.cassandra.internal:type=HintedHandoff, which tells me that
>>>>>> there is 1 active task, and
>>>>>> org.apache.cassandra.db:type=HintedHandoffManager#countPendingHints(),
>>>>>> which quite often gives a timeout when executed.
>>>>>>
>>>>>> Ideally, I would like to see how many hints have been sent (e.g. over
>>>>>> the last minute or so), and how many hints are still to be sent (although I
>>>>>> assume that's what countPendingHints normally does?)
>>>>>>
>>>>>> I'm experiencing hinted handoff tasks that are started, but never
>>>>>> finish, so I would like to know what the task is doing.
>>>>>>
>>>>>> My log shows this:
>>>>>>
>>>>>> INFO [HintedHandoff:1] 2013-12-02
>>>>>> 13:49:05,325 HintedHandOffManager.java (line 297) Started hinted handoff
>>>>>> for host: 6f80b942-5b6d-4233-9827-3727591abf55 with IP: /10.55.156.66
>>>>>> (nothing more for [HintedHandoff:1])
>>>>>>
>>>>>> The node is up and running, the network connection is ok, no gossip
>>>>>> messages appear in the logs.
>>>>>>
>>>>>> Any idea is welcome.
>>>>>> (Cassandra 1.2.3)
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>> Drillster BV
>>>>>> Middenburcht 136
>>>>>> 3452MT Vleuten
>>>>>> Netherlands
>>>>>>
>>>>>> +31 30 755 5330
>>>>>>
>>>>>> Open your free account at www.drillster.com
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>


-- 

Drillster BV
Middenburcht 136
3452MT Vleuten
Netherlands

+31 30 755 5330

Open your free account at www.drillster.com

--001a1133d6eed8c21604ecf1be19--