From: Hiroyuki Yamada <mogwaing@gmail.com>
Date: Thu, 25 Apr 2019 16:25:40 +0900
Subject: Re: A cluster (RF=3) not recovering after two nodes are stopped
To: user@cassandra.apache.org

Hello,

Sorry again.
We found yet another weird thing related to this issue.
If we stop nodes with systemctl or a plain kill (SIGTERM), the problem occurs,
but if we kill -9 (SIGKILL), it does not.

Thanks,
Hiro

On Wed, Apr 24, 2019 at 11:31 PM Hiroyuki Yamada <mogwaing@gmail.com> wrote:

> Sorry, I didn't mention the version and the configuration.
> I've tested with C* 3.11.4, and
> the configuration is mostly left at the defaults, except for the replication
> factor and listen_address for proper networking.
>
> Thanks,
> Hiro
>
> On Wed, Apr 24, 2019 at 5:12 PM Hiroyuki Yamada <mogwaing@gmail.com> wrote:
>
>> Hello Ben,
>>
>> Thank you for the quick reply.
>> I haven't tried that case, but it doesn't recover even after I stop the
>> stress.
>>
>> Thanks,
>> Hiro
>>
>> On Wed, Apr 24, 2019 at 3:36 PM Ben Slater <ben.slater@instaclustr.com> wrote:
>>
>>> Is it possible that stress is overloading node 1 so it's not recovering
>>> state properly when node 2 comes up? Have you tried running with a lower
>>> load (say 2 or 3 threads)?
>>>
>>> Cheers
>>> Ben
>>>
>>> ---
>>>
>>> *Ben Slater*
>>> *Chief Product Officer*
>>>
>>> Read our latest technical blog posts here.
>>>
>>> This email has been sent on behalf of Instaclustr Pty. Limited
>>> (Australia) and Instaclustr Inc (USA).
>>>
>>> This email and any attachments may contain confidential and legally
>>> privileged information. If you are not the intended recipient, do not copy
>>> or disclose its content, but please reply to this email immediately and
>>> highlight the error to the sender and then immediately delete the message.
>>>
>>> On Wed, 24 Apr 2019 at 16:28, Hiroyuki Yamada <mogwaing@gmail.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> I faced a weird issue when recovering a cluster after two nodes are
>>>> stopped.
>>>> It is easily reproducible and looks like a bug or an issue to fix,
>>>> so let me write down the steps to reproduce.
>>>>
>>>> === STEPS TO REPRODUCE ===
>>>> * Create a 3-node cluster with RF=3
>>>>   - node1 (seed), node2, node3
>>>> * Start requests to the cluster with cassandra-stress (it keeps running until the end of the test)
>>>>   - what we did: cassandra-stress mixed cl=QUORUM duration=10m -errors ignore -node node1,node2,node3 -rate threads\>=16 threads\<=256
>>>> * Stop node3 normally (with systemctl stop)
>>>>   - the system is still available because a quorum of replicas is still available
>>>> * Stop node2 normally (with systemctl stop)
>>>>   - the system is NOT available after it's stopped
>>>>   - the client gets `UnavailableException: Not enough replicas available for query at consistency QUORUM`
>>>>   - the client gets the errors right away (within a few ms)
>>>>   - so far this is all expected
>>>> * Wait for 1 minute
>>>> * Bring up node2
>>>>   - the issue happens here
>>>>   - the client gets `ReadTimeoutException` or `WriteTimeoutException` depending on whether the request is a read or a write, even after node2 is up
>>>>   - the client gets the errors only after about 5000 ms or 2000 ms, which are the default request timeouts for read and write requests respectively
>>>>   - what node1 reports with `nodetool status` and what node2 reports are not consistent (node2 thinks node1 is down)
>>>>   - it takes a very long time to recover from this state
>>>> === STEPS TO REPRODUCE ===
>>>>
>>>> Is this supposed to happen?
>>>> If we don't run cassandra-stress, everything is fine.
>>>>
>>>> Some workarounds we found to recover from this state are the following:
>>>> * Restarting node1; it recovers its state right after it's restarted
>>>> * Setting a lower value for dynamic_snitch_reset_interval_in_ms (to 60000 or so)
>>>>
>>>> I don't think either of them is a really good solution.
>>>> Can anyone explain what is going on, and what is the best way to
>>>> prevent it or to recover?
>>>>
>>>> Thanks,
>>>> Hiro
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
>>>> For additional commands, e-mail: user-help@cassandra.apache.org
>>>>
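A minimal client-side sketch of the two failure modes described in the steps above (not part of the original thread), assuming the DataStax Python driver against the same 3-node cluster; the keyspace and table names are hypothetical:

from cassandra import ConsistencyLevel, Unavailable, ReadTimeout, WriteTimeout
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

# Connect to the same three nodes used in the reproduction steps.
cluster = Cluster(["node1", "node2", "node3"])
session = cluster.connect("stress_ks")  # hypothetical keyspace

# A QUORUM read; with RF=3 the coordinator needs 2 replicas to answer.
read = SimpleStatement(
    "SELECT value FROM kv WHERE key = %s",  # hypothetical table
    consistency_level=ConsistencyLevel.QUORUM,
)

try:
    session.execute(read, ["some-key"])
except Unavailable:
    # While node2 and node3 are both stopped, the coordinator knows a quorum
    # cannot be assembled and fails fast (the "errors right away" case).
    pass
except (ReadTimeout, WriteTimeout):
    # After node2 is brought back but the nodes' views of each other are still
    # inconsistent, the coordinator waits on replicas that never answer and only
    # fails after read_request_timeout_in_ms / write_request_timeout_in_ms
    # (5000 ms / 2000 ms by default), matching the slow errors reported above.
    pass

The sketch only illustrates the difference between the two failure paths: Unavailable is returned as soon as the coordinator knows a quorum (2 of 3 replicas for RF=3) cannot be met, whereas the timeout exceptions surface only after the configured request timeouts expire.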