From: Carl Mueller <carl.mueller@smartthings.com>
Date: Wed, 12 Jun 2019 11:36:37 -0500
To: user@cassandra.apache.org
Subject: Re: postmortem on 2.2.13 scale out difficulties

We're getting

DEBUG [GossipStage:1] 2019-06-12 15:20:07,797 MigrationManager.java:96 - Not pulling schema because versions match or
shouldPullSchemaFrom returned false

multiple times, as it contacts the nodes.

On Wed, Jun 12, 2019 at 11:35 AM Carl Mueller wrote:

> We were only able to scale out four nodes before failures started
> occurring, including multiple instances of nodes joining the cluster
> without streaming.
>
> Sigh.
>
> On Tue, Jun 11, 2019 at 3:11 PM Carl Mueller wrote:
>
>> We had a three-DC (asia-tokyo/europe/us) cassandra 2.2.13 cluster, AWS,
>> IPV6.
>>
>> We needed to scale out the asia datacenter, which was 5 nodes; europe
>> and us were 25 nodes.
>>
>> We were running into bootstrapping issues where the new node failed to
>> bootstrap/stream; it failed with
>>
>> "java.lang.RuntimeException: A node required to move the data
>> consistently is down"
>>
>> ...even though they were all up based on nodetool status prior to adding
>> the node.
>>
>> First we increased the phi_convict_threshold to 12, and that did not
>> help.
>>
>> CASSANDRA-12281 appeared similar to what we had problems with, but I
>> don't think we hit that. Somewhere in there someone wrote:
>>
>> "For us, the workaround is either deleting the data (then bootstrap
>> again), or increasing the ring_delay_ms. And the larger the cluster is,
>> the longer ring_delay_ms is needed. Based on our tests, for a 40 nodes
>> cluster, it requires ring_delay_ms to be >50seconds. For a 70 nodes
>> cluster, >100seconds. Default is 30seconds."
>>
>> Given the WAN nature of our DCs, we set ring_delay_ms to 100 seconds and
>> it finally worked.
>>
>> Side note:
>>
>> During the rolling restarts for setting phi_convict_threshold we
>> observed quite a lot of status map variance between nodes (we have a
>> program to poll a whole datacenter's or cluster's view of gossipinfo and
>> statuses; a rough sketch of that kind of poller is appended below). AWS
>> appears to have variance in networking, consistent with the usual
>> phi_convict_threshold advice; I'm not sure if our difficulties were
>> typical in that regard and/or if our IPV6 and/or globally distributed
>> datacenters were exacerbating factors.
>>
>> We could not reproduce this in loadtest, although loadtest is only eu
>> and us (but it is IPV6).
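[Editor's sketch, not the poster's internal tool] For anyone wanting to do a
similar cross-node comparison, below is a minimal sketch of a status poller.
The hostnames are placeholders, it assumes SSH access to each node with
nodetool on the PATH, and it only diffs the state column (UN/DN/UJ/...) of
`nodetool status` as seen from every node, which is enough to surface the
kind of "status map variance" described above.

#!/usr/bin/env python3
"""Minimal sketch: compare each node's view of cluster status.

Assumptions (not from the thread): passwordless SSH to every node,
nodetool available remotely, placeholder hostnames below.
"""
import subprocess
from collections import defaultdict

# Hypothetical node list for one datacenter; substitute real hosts.
NODES = [
    "cass-ap-1.example.com",
    "cass-ap-2.example.com",
    "cass-ap-3.example.com",
]


def run_remote(host: str, command: str) -> str:
    """Run a command on a node over ssh and return its stdout."""
    result = subprocess.run(
        ["ssh", host, command],
        capture_output=True, text=True, timeout=30, check=True,
    )
    return result.stdout


def status_view(host: str) -> dict:
    """Parse `nodetool status` as seen from one node into {peer_ip: state}."""
    view = {}
    for line in run_remote(host, "nodetool status").splitlines():
        parts = line.split()
        # Data rows start with a two-letter state like "UN 10.0.1.12 ..."
        if parts and parts[0] in ("UN", "DN", "UJ", "DJ", "UL", "DL", "UM", "DM"):
            view[parts[1]] = parts[0]
    return view


def main() -> None:
    # Collect every node's view of the cluster, then report peers whose
    # state differs depending on which node you ask.
    views = {host: status_view(host) for host in NODES}
    peers = set().union(*views.values())
    for peer in sorted(peers):
        states = defaultdict(list)
        for host, view in views.items():
            states[view.get(peer, "missing")].append(host)
        if len(states) > 1:
            print(f"{peer}: inconsistent views -> {dict(states)}")


if __name__ == "__main__":
    main()

For reference (editor's note, hedged): phi_convict_threshold is set in
cassandra.yaml, while ring_delay_ms is normally raised via the
cassandra.ring_delay_ms startup system property, in milliseconds, so 100
seconds would be on the order of -Dcassandra.ring_delay_ms=100000.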