From: Hiroyuki Yamada <mogwaing@gmail.com>
Date: Thu, 25 Apr 2019 16:25:40 +0900
Subject: Re: A cluster (RF=3) not recovering after two nodes are stopped
To: user@cassandra.apache.org

Hello,

Sorry again.
We found yet another weird thing related to this issue.
If we stop nodes with systemctl or a plain kill (SIGTERM), the problem occurs,
but if we kill -9 (SIGKILL), it does not.

Thanks,
Hiro

On Wed, Apr 24, 2019 at 11:31 PM Hiroyuki Yamada <mogwaing@gmail.com> wrote:

> Sorry, I didn't mention the version and the configuration.
> I've tested with C* 3.11.4, and
> the configuration is mostly left at the defaults, except for the replication
> factor and listen_address for proper networking.
>
> Thanks,
> Hiro
>
> On Wed, Apr 24, 2019 at 5:12 PM Hiroyuki Yamada <mogwaing@gmail.com> wrote:
>
>> Hello Ben,
>>
>> Thank you for the quick reply.
>> I haven't tried that case, but it doesn't recover even after I stop the
>> stress.
>>
>> Thanks,
>> Hiro
>>
>> On Wed, Apr 24, 2019 at 3:36 PM Ben Slater <ben.slater@instaclustr.com> wrote:
>>
>>> Is it possible that stress is overloading node 1 so it's not recovering
>>> state properly when node 2 comes up? Have you tried running with a lower
>>> load (say 2 or 3 threads)?
>>>
>>> Cheers
>>> Ben
>>>
>>> ---
>>>
>>> *Ben Slater*
>>> *Chief Product Officer*
>>>
>>> Read our latest technical blog posts here.
>>>
>>> This email has been sent on behalf of Instaclustr Pty. Limited
>>> (Australia) and Instaclustr Inc (USA).
>>>
>>> This email and any attachments may contain confidential and legally
>>> privileged information. If you are not the intended recipient, do not copy
>>> or disclose its content, but please reply to this email immediately and
>>> highlight the error to the sender and then immediately delete the message.
>>>
>>> On Wed, 24 Apr 2019 at 16:28, Hiroyuki Yamada <mogwaing@gmail.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> I faced a weird issue when recovering a cluster after two nodes are
>>>> stopped.
>>>> It is easily reproducible and looks like a bug or an issue to fix,
>>>> so let me write down the steps to reproduce.
>>>>
>>>> === STEPS TO REPRODUCE ===
>>>> * Create a 3-node cluster with RF=3
>>>>   - node1 (seed), node2, node3
>>>> * Start requests to the cluster with cassandra-stress (it keeps running until the end of the test)
>>>>   - what we did: cassandra-stress mixed cl=QUORUM duration=10m -errors ignore -node node1,node2,node3 -rate threads\>=16 threads\<=256
>>>> * Stop node3 normally (with systemctl stop)
>>>>   - the system is still available because a quorum of replicas is still available
>>>> * Stop node2 normally (with systemctl stop)
>>>>   - the system is NOT available after it's stopped
>>>>   - the client gets `UnavailableException: Not enough replicas available for query at consistency QUORUM`
>>>>   - the client gets the errors right away (within a few ms)
>>>>   - so far this is all expected
>>>> * Wait for 1 minute
>>>> * Bring up node2
>>>>   - the issue happens here
>>>>   - the client gets `ReadTimeoutException` or `WriteTimeoutException` depending on whether the request is a read or a write, even after node2 is up
>>>>   - the client gets the errors only after about 5000 ms or 2000 ms, which are the default request timeouts for read and write requests respectively
>>>>   - what node1 reports with `nodetool status` and what node2 reports are not consistent (node2 thinks node1 is down)
>>>>   - it takes a very long time to recover from this state
>>>> === STEPS TO REPRODUCE ===
>>>>
>>>> Is this supposed to happen?
>>>> If we don't run cassandra-stress, everything is fine.
>>>>
>>>> Some workarounds we found to recover from this state are the following:
>>>> * Restarting node1; it recovers its state right after it's restarted
>>>> * Setting a lower value for dynamic_snitch_reset_interval_in_ms (to 60000 or so)
>>>>
>>>> I don't think either of them is a really good solution.
>>>> Can anyone explain what is going on, and what is the best way to
>>>> prevent it or to recover?
>>>>
>>>> Thanks,
>>>> Hiro
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
>>>> For additional commands, e-mail: user-help@cassandra.apache.org
>>>>
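A minimal client-side sketch of the two failure modes described in the steps above (not part of the original thread), assuming the DataStax Python driver against the same 3-node cluster; the keyspace and table names are hypothetical:

from cassandra import ConsistencyLevel, Unavailable, ReadTimeout, WriteTimeout
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

# Connect to the same three nodes used in the reproduction steps.
cluster = Cluster(["node1", "node2", "node3"])
session = cluster.connect("stress_ks")  # hypothetical keyspace

# A QUORUM read; with RF=3 the coordinator needs 2 replicas to answer.
read = SimpleStatement(
    "SELECT value FROM kv WHERE key = %s",  # hypothetical table
    consistency_level=ConsistencyLevel.QUORUM,
)

try:
    session.execute(read, ["some-key"])
except Unavailable:
    # While node2 and node3 are both stopped, the coordinator knows a quorum
    # cannot be assembled and fails fast (the "errors right away" case).
    pass
except (ReadTimeout, WriteTimeout):
    # After node2 is brought back but the nodes' views of each other are still
    # inconsistent, the coordinator waits on replicas that never answer and only
    # fails after read_request_timeout_in_ms / write_request_timeout_in_ms
    # (5000 ms / 2000 ms by default), matching the slow errors reported above.
    pass

The sketch only illustrates the difference between the two failure paths: Unavailable is returned as soon as the coordinator knows a quorum (2 of 3 replicas for RF=3) cannot be met, whereas the timeout exceptions surface only after the configured request timeouts expire.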