Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
MIME-Version: 1.0
In-Reply-To: <CADVoknz3gtxCNKLdUc57c5tV_YTMFfbhYT_dhtd293LiN0z07g@mail.gmail.com>
References: <CADVoknz3gtxCNKLdUc57c5tV_YTMFfbhYT_dhtd293LiN0z07g@mail.gmail.com>
From: Anthony Grasso <anthony.grasso@gmail.com>
Date: Mon, 1 May 2017 10:27:28 +1000
Message-ID: <CAGbQL+WrmgTX9mNS1Lp-RPio9V_FF5EaT7TAQXPJpUmnOw+qsQ@mail.gmail.com>
Subject: Re: Very slow cluster
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=94eb2c14bd966178e3054e6b7fd1
archived-at: Mon, 01 May 2017 00:28:22 -0000

--94eb2c14bd966178e3054e6b7fd1
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Hi Eduardo,

Please see my comment inline below regarding your third question.

Regards,
Anthony

On 28 April 2017 at 21:26, Eduardo Alonso <eduardoalonso@stratio.com> wrote:

> Hi to all:
>
> I am having some problems with two of a client's cassandra:3.0.8 clusters that I
> want to share with you. These clusters are for QA and DEV.
>
> Cluster 1 (1 DC) is composed of 3 VMs (heap=4G, RAM=8G) sharing the
> same physical machine and sharing one SSD. I know this is not the best
> environment but it is only for testing purposes.
>
> The entire cluster runs very slowly and sometimes some inserts fail,
> causing hints to be saved and replayed, and some data inconsistency with 2i
> queries.
>
> I know it is not the best environment (virtual machines sharing a physical
> machine and one physical disk) but it is very weird to me that exactly the
> same test case works like a charm in 3 Docker containers on my
> laptop (i7, 16G, SSD) but causes a lot of problems in their cluster.
>
> *listen_address* and *rpc_address* are set to the external domain name (i.e.
> NODE_NAME.clientdomain.com). I have activated TRACE logs and get some
> strange messages.
>
> So, my questions:
>
> *1.- Is it possible for one node (with ) to send a message to itself triggering
> READ_REPAIR?*
>
> TRACE [SharedPool-Worker-1] 2017-04-24 08:58:28,558
> MessagingService.java:750 - Message-to-self TYPE:MUTATION VERB:READ_REPAIR going
> over MessagingService
>
>     TRACE [SharedPool-Worker-1] 2017-04-16 04:38:47,513
> MessagingService.java:747 -01a.clientdomain.com/10.63.24.238 sending
> READ_REPAIR to 3426@/10.63.24.238"
>
> *Does this log line show one node asking itself for a portion of data
> that it does not have?*
>
> *2.-* I have another suspicious log line about slow vms:
>
> -WARN  [GossipTasks:1] 2017-04-14 00:32:44,371 FailureDetector.java:287 -
> Not marking nodes down due to local pause of 11195193520 > 5000000000
>
> *Does this line say that there was a JVM pause of 11 secs*? There are
> no garbage collector log lines. *Is it possible that this 11 sec pause is
> caused by a DNS lookup of the domain?*
>
>
> *3.-* I know that listen_address must be the external IP (internode
> communication will be faster, with no need for a DNS lookup)
>
> *If I set listen_address to the external IP, does that IP need to be
> pingable from all the other datacenter nodes?*
> *Do inter-data-center communications use 'rpc_address' or
> 'listen_address'*?
>
>
All nodes in the cluster must be configured so that they can contact each
other on their *listen_address*. Being pingable is not a requirement in
itself, but enabling ICMP can be useful for debugging internode
communication problems.
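
If ICMP is blocked, a plain TCP connection test is another way to check that
one node can reach another. Below is a minimal sketch, not something taken
from Cassandra itself. The class name and the default peer address are
placeholders I made up, and it assumes the default storage_port of 7000
(pass a different port if your cassandra.yaml overrides it).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class InternodeCheck {
    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unresolvable host: treat all as unreachable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical defaults: replace with a peer's listen_address and storage_port.
        String peer = args.length > 0 ? args[0] : "127.0.0.1";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 7000;
        System.out.println(peer + ":" + port + " reachable=" + canConnect(peer, port, 2000));
    }
}
```

Run it from each node against the listen_address of every other node. A
"reachable=false" result only tells you the TCP connection failed, so you
would still need to check firewalls and name resolution to find out why.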

Regarding internode communication: *listen_address* is the address used for
internode communication in the cluster, whereas *rpc_address* is the address
clients connect to. Note that if you don't want to manually set an IP for
*listen_address* on each node in your cluster, you can leave it blank and
Cassandra will use *InetAddress.getLocalHost()* to pick an address.
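
Before leaving *listen_address* blank, it may be worth checking what
*InetAddress.getLocalHost()* actually resolves to on each VM. The sketch
below is only an illustration (the class and method names are mine, not part
of Cassandra). It prints the hostname and address the JVM picks and flags a
loopback result.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class LocalHostCheck {
    // Describes the address the JVM resolves for the local host.
    static String describeLocalHost() {
        try {
            InetAddress local = InetAddress.getLocalHost();
            String summary = local.getHostName() + " -> " + local.getHostAddress();
            // A loopback address here would not be reachable from other nodes.
            return local.isLoopbackAddress() ? summary + " (loopback)" : summary;
        } catch (UnknownHostException e) {
            // The local hostname could not be resolved to an address.
            return "unresolved: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describeLocalHost());
    }
}
```

If this reports a loopback address or fails to resolve, the hostname
configuration on that node probably needs fixing, or you can set
*listen_address* explicitly instead.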


> Thank you in advance
>
> Eduardo Alonso
> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
> 28224 Pozuelo de Alarcón, Madrid
> Tel: +34 91 828 6473 // www.stratio.com // @stratiobd
> <https://twitter.com/StratioBD>
>


--94eb2c14bd966178e3054e6b7fd1--