From: Jeff Jirsa <jjirsa@gmail.com>
Content-Type: multipart/alternative;
	boundary=Apple-Mail-1BF796D7-A9D2-423E-B5C3-DFE43A9715F1
Content-Transfer-Encoding: 7bit
Mime-Version: 1.0 (1.0)
Date: Tue, 27 Feb 2018 09:47:40 -0800
Subject: Re: Version Rollback
Message-Id: <34DC363B-D7B8-4027-81A6-5F774A928AC1@gmail.com>
References: <CAM3v5Fu83d0QHP-Q0JANA-QW4h2jap3WuOHv5XKg8=bsHMEQXw@mail.gmail.com> <000301d3afd6$7eae2340$7c0a69c0$@yahoo.com> <CAFjxgDxTcc8k+hJ+abuBA+4fEpnFdOEGi7JBAkksU27vWJkXPw@mail.gmail.com>
In-Reply-To: <CAFjxgDxTcc8k+hJ+abuBA+4fEpnFdOEGi7JBAkksU27vWJkXPw@mail.gmail.com>
To: user@cassandra.apache.org
X-Mailer: iPhone Mail (15D60)

--Apple-Mail-1BF796D7-A9D2-423E-B5C3-DFE43A9715F1
Content-Type: text/plain;
	charset=utf-8
Content-Transfer-Encoding: quoted-printable

MOST minor versions support rollback. The exceptions are those where the
internode protocol changes (3.0.14 being the only one in recent memory)
or where the sstable format changes (again, rare). No major versions
support rollback; the only way to do it is to upgrade in a way that lets
you effectively reinstall the old version without data loss.

The steps usually look like:

Test in a lab
Test in a lab again
Test in a lab a few more times
Snapshot everything
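
A minimal sketch of the snapshot step, assuming nodetool is on the PATH
and that this runs once on every node (snapshots are per node). The
helper names, the tag, and the injectable runner are hypothetical and
only there for illustration:

```python
import subprocess


def snapshot_command(tag, keyspaces=()):
    """Build the nodetool call for a tagged snapshot.

    With no keyspaces listed, nodetool snapshots every keyspace.
    """
    return ["nodetool", "snapshot", "-t", tag, *keyspaces]


def snapshot_node(tag, keyspaces=(), runner=subprocess.run):
    """Take the snapshot on the local node; raises if nodetool fails.

    `runner` is injectable (an assumption of this sketch) so the call
    can be dry-run without a Cassandra install.
    """
    return runner(snapshot_command(tag, keyspaces), check=True)
```

Tagging the snapshot (for example "pre-upgrade") makes it easy to find
and clear later.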

If you have a passive data center:
- upgrade one instance
- check to see if it's happy
- upgrade another
- check to see if it's happy
- continue until the passive DC is done
- if at any point they're unhappy, rebuild the DC from the active DC
  (wipe it and restream on the old version)
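
The passive-DC loop above can be sketched roughly as follows. The
callables `upgrade`, `is_happy`, and `rebuild_dc` are hypothetical
placeholders for whatever tooling you use; nothing here is a Cassandra
API:

```python
def upgrade_passive_dc(nodes, upgrade, is_happy, rebuild_dc):
    """Upgrade one node at a time, re-checking every upgraded node.

    Returns True if the whole DC was upgraded, or False if a node was
    unhappy and the DC was handed to `rebuild_dc` (wipe and restream
    from the active DC on the old version).
    """
    upgraded = []
    for node in nodes:
        upgrade(node)
        upgraded.append(node)
        # "At any point": check all upgraded nodes, not just the latest.
        if not all(is_happy(n) for n in upgraded):
            rebuild_dc()
            return False
    return True
```

Stopping at the first unhappy node keeps the amount of data to restream
no larger than the one DC.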

On the active DCs, you'll want to canary it one replica at a time so you
can treat a failed upgrade like a bad disk:
- upgrade one instance
- check if it's happy; if it's not, treat it like a failed disk and
  replace it with the old version
- if you're using single token, do another instance in a different
  replica set, and repeat until you run out of different replica sets
- if you're using vnodes but a rack-aware snitch and have more racks
  than your RF, do another instance in the same rack as the canary, and
  repeat until you're out of instances in that rack
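
A rough sketch of how the next instances could be picked in the two
cases above. Both helpers are hypothetical. The single-token one assumes
simple ring placement (each range is held by its owner plus the next
RF-1 nodes clockwise), so two nodes share a replica set only when they
are fewer than RF positions apart on the ring. The rack one assumes
replicas of a range are spread across distinct racks, so one rack holds
at most one replica of any range:

```python
def single_token_candidates(ring, rf, canary):
    """Nodes (in ring order from the canary) sharing no replica set.

    `ring` lists nodes in token order; placement is assumed to be
    owner plus the next rf-1 nodes clockwise.
    """
    n = len(ring)
    start = ring.index(canary)
    picked = [start]
    for step in range(1, n):
        i = (start + step) % n
        # Keep a circular distance of at least rf from every pick.
        if all(min((i - p) % n, (p - i) % n) >= rf for p in picked):
            picked.append(i)
    return [ring[i] for i in picked]


def same_rack_candidates(rack_of, canary):
    """All nodes in the canary's rack (`rack_of` maps node -> rack)."""
    rack = rack_of[canary]
    return [node for node, r in rack_of.items() if r == rack]
```

Either way, the goal is the same: no token range ends up with more than
one replica on the new version.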

This is typically your point of no return - as soon as you have two
replicas on the new version, rollback is no longer practical.
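
One way to check that condition before touching the next instance, as a
sketch only: `replica_sets` is an assumed input (one collection of node
names per token range, however you derive it), and the function names
are made up for illustration:

```python
def upgraded_replicas_per_range(replica_sets, upgraded):
    """Count upgraded nodes in each replica set."""
    done = set(upgraded)
    return [len(set(rs) & done) for rs in replica_sets]


def rollback_still_practical(replica_sets, upgraded):
    """True while no replica set has two or more upgraded replicas."""
    counts = upgraded_replicas_per_range(replica_sets, upgraded)
    return max(counts, default=0) <= 1
```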


-- 
Jeff Jirsa


> On Feb 27, 2018, at 9:22 AM, Carl Mueller <carl.mueller@smartthings.com> wrote:
>
> My speculation is that IF (big if) the sstable formats are compatible
> between the versions, which probably isn't the case for major
> versions, then you could drop back.
>
> If the sstables changed format, then you'll probably need to figure
> out how to rewrite the sstables in the older format and then
> sstableloader them into the older-version cluster if need be. Alas,
> while there is an sstable upgrader, there isn't a downgrader AFAIK.
>
> And I don't have an intimate view of version-by-version sstable format
> changes and compatibilities. You'd probably need to check the upgrade
> instructions (which you presumably did if you're upgrading versions)
> to tell.
>
> Basically, version rollback is pretty unlikely to be done.
>
> The OTHER option:
>
> 1) Build a new cluster with the new version, no new data.
>
> 2) Code your driver interfaces to talk to both clusters. Write to
> both, but read preferentially from the new, then fall through to the
> old. Yes, that gets hairy on multi-row queries. Port your data
> gradually from the old cluster to the new with sstable loading.
>
> When you've done a full load of all the data from old to new, and
> you're satisfied with the new cluster's stability, retire the old
> cluster.
>
> For merging two multi-row sets, you'll probably need your multi-row
> queries to return the partition hash value (or extract the code that
> generates the hash), have two simultaneous java-driver ResultSets
> going, and merge their results, providing the illusion of a single
> database query. You'll need to pay attention to both the row key
> ordering and column key ordering to ensure the combined results are
> properly ordered.
>
> Writes will be slowed by the double writes, and reads will be bound by
> the worse-performing cluster.
>
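
For the read fall-through and multi-row merge described in the quoted
message, a minimal sketch in Python rather than the java-driver. It
assumes each row arrives as a (token, clustering_key, payload) tuple,
that both clusters return rows already sorted by token and then
clustering key, and that the new cluster wins when both have the same
row. The token could be fetched with the CQL token() function; the
function names and row layout are hypothetical:

```python
import heapq
from itertools import groupby


def read_with_fallthrough(key, read_new, read_old):
    """Single-partition read: try the new cluster, then the old one."""
    rows = read_new(key)
    return rows if rows else read_old(key)


def merge_rows(new_rows, old_rows):
    """Merge two sorted row streams, preferring the new cluster on ties.

    Rows are (token, clustering_key, payload); the third sort field is
    the source (0 = new, 1 = old), so the new copy sorts first.
    """
    tagged_new = ((r[0], r[1], 0, r) for r in new_rows)
    tagged_old = ((r[0], r[1], 1, r) for r in old_rows)
    merged = heapq.merge(tagged_new, tagged_old, key=lambda t: t[:3])
    for _, group in groupby(merged, key=lambda t: t[:2]):
        # First entry in each group is the preferred copy of that row.
        yield next(group)[3]
```

This ignores the case where two different partition keys hash to the
same token, which a real implementation would need to break ties on.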
>> On Tue, Feb 27, 2018 at 8:23 AM, Kenneth Brotman <kenbrotman@yahoo.com.invalid> wrote:
>> Could you tell us the size and configuration of your Cassandra cluster?
>>
>> Kenneth Brotman
>>
>> From: shalom sagges [mailto:shalomsagges@gmail.com]
>> Sent: Tuesday, February 27, 2018 6:19 AM
>> To: user@cassandra.apache.org
>> Subject: Version Rollback
>>
>> Hi All,
>>
>> I'm planning to upgrade my C* cluster to version 3.x and was
>> wondering what's the best way to perform a rollback if need be.
>>
>> If I used snapshot restoration, I would be facing data loss,
>> depending on when I took the snapshot (e.g. a rollback might be
>> required after upgrading half the cluster).
>>
>> If I add another DC to the cluster with the old version, then I could
>> point the apps to talk to that DC if anything bad happens, but
>> building it is really time-consuming and requires a lot of resources.
>>
>> Can anyone provide recommendations on this matter? Any ideas on how
>> to make the upgrade foolproof, or at least "really really safe"?
>>
>> Thanks!
>


--Apple-Mail-1BF796D7-A9D2-423E-B5C3-DFE43A9715F1--