From: Shishir Kumar <shishirroy2000@gmail.com>
Date: Sat, 30 Nov 2019 10:27:52 +0530
Subject: Re: Upgrade strategy for high number of nodes
To: user@cassandra.apache.org

Some more background. We have planned (and tested) a binary upgrade across
all nodes without downtime. The next step is running upgradesstables. The
C* file format and version change from format big, version mc to format
bti, version aa (refer to
https://docs.datastax.com/en/dse/6.0/dse-admin/datastax_enterprise/tools/toolsSStables/ToolsSSTableupgrade.html
- upgrade from DSE 5.1 to 6.x). These underlying changes explain why the
upgrade takes so much time.
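As a quick way to track upgradesstables progress per node, something like
the sketch below counts old- vs new-format sstables on disk. The data
directory path and the exact filename patterns (version prefix in the
Data.db name) are assumptions here - adjust for your install:

    # Rough progress check: sstables still in the old format vs the new
    # one. Default data directory assumed; adjust as needed.
    DATA_DIR=/var/lib/cassandra/data
    echo "old (big/mc): $(find "$DATA_DIR" -name 'mc-*-big-Data.db' | wc -l)"
    echo "new (bti/aa): $(find "$DATA_DIR" -name 'aa-*-bti-Data.db' | wc -l)"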
Running upgradesstables in parallel across racks - this is where I am not
sure about the impact (the documentation recommends running one node at a
time). During upgradesstables there are scenarios where it reports file
corruption, which requires a corrective step, i.e. scrub. Because of such
corruption, nodes at times go down or end up at ~100% CPU usage.
Performing the above in parallel *without downtime* might therefore result
in more inconsistency across nodes. We have not tested this scenario, so I
would appreciate the group's help if anyone has done a similar upgrade in
the past (i.e. the scenarios/complexity that need to be considered, and
why the guidelines recommend running upgradesstables one node at a time).
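For reference, the per-node sequence we are testing looks roughly like
the sketch below - one node at a time, as the docs recommend. The host
names and ssh access are assumptions, and treat this as a sketch rather
than a final runbook:

    #!/usr/bin/env bash
    # Rough sketch of the per-node step. Assumes nodetool is on the PATH
    # of each host and the node list is known (hypothetical names here).
    set -u
    NODES="node1 node2 node3"
    for host in $NODES; do
        echo "=== $host: rewriting sstables ==="
        if ! ssh "$host" nodetool upgradesstables; then
            # If upgradesstables aborts on a corrupt sstable, scrub and
            # retry. scrub can drop unreadable data, so follow up with a
            # repair on the node to restore consistency.
            echo "=== $host: corruption reported, running scrub ==="
            ssh "$host" nodetool scrub
            ssh "$host" nodetool upgradesstables
        fi
    done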
-Shishir

On Fri, Nov 29, 2019 at 11:52 PM Josh Snyder <josh@code406.com> wrote:

> Hello Shishir,
>
> It shouldn't be necessary to take downtime to perform upgrades of a
> Cassandra cluster. It sounds like the biggest issue you're facing is the
> upgradesstables step. upgradesstables is not strictly necessary before a
> Cassandra node re-enters the cluster to serve traffic; in my experience
> it is purely for optimizing the performance of the database once the
> software upgrade is complete. I recommend trying out an upgrade in a
> test environment without using upgradesstables, which should bring the 5
> hours per node down to just a few minutes.
>
> If you're running NetworkTopologyStrategy and you want to optimize
> further, you could consider performing the upgrade on multiple nodes
> within the same rack in parallel. When correctly configured,
> NetworkTopologyStrategy can protect your database from an outage of an
> entire rack. So performing an upgrade on a few nodes at a time within a
> rack is the same as a partial rack outage, from the database's
> perspective.
>
> Have a nice upgrade!
>
> Josh
>
> On Fri, Nov 29, 2019 at 7:22 AM Shishir Kumar <shishirroy2000@gmail.com>
> wrote:
>
>> Hi,
>>
>> Need input on a Cassandra upgrade strategy for the setup below:
>> 1. We have datacenters across 4 geographies (multiple isolated
>> deployments in each DC).
>> 2. The number of Cassandra nodes in each deployment is between 6 and 24.
>> 3. Data volume on each node is between 150 and 400 GB.
>> 4. All production environments have DR set up.
>> 5. We do not want downtime during the upgrade.
>>
>> We are planning a stack upgrade, but upgradesstables takes approx. 5
>> hours per node (when the data volume is approx. 200 GB).
>>
>> Options:
>> No downtime - Per the recommendation (DataStax documentation), if we
>> upgrade one node at a time, i.e. in sequence, the upgrade cycle for one
>> environment will take weeks, which is a DevOps concern.
>> Read only (no downtime) - Route read-only load to the DR system. We
>> have resilience built in to handle mutation scenarios, but if the
>> upgrade takes more than, say, 3-4 hours, there will be a long catch-up
>> exercise. The maintenance cost seems too high because of the unknowns.
>> Downtime - Upgrade all nodes in parallel while no customers are live.
>> This has direct customer impact, so we would need to weigh maintenance
>> cost against customer impact.
>>
>> Please suggest how other organisations (those with 100+ nodes) are
>> solving this scenario.
>>
>> Regards
>> Shishir
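PS: a rough sketch of the rack-parallel rollout Josh describes, in case
it helps others on the list. The parsing of nodetool status output
(status flag in column 1, address in column 2, rack in the last column)
and ssh access are assumptions - verify the column layout on your
version before relying on it:

    #!/usr/bin/env bash
    # First confirm keyspaces really use NetworkTopologyStrategy, then
    # run upgradesstables on all nodes of one rack in parallel before
    # moving on to the next rack.
    cqlsh -e "SELECT keyspace_name, replication FROM system_schema.keyspaces;"

    for rack in $(nodetool status | awk '/^(UN|DN)/ {print $NF}' | sort -u); do
        echo "=== upgrading rack $rack ==="
        for host in $(nodetool status | awk -v r="$rack" '/^UN/ && $NF==r {print $2}'); do
            ssh "$host" nodetool upgradesstables &   # whole rack in parallel
        done
        wait   # let this rack finish before starting the next
    done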