From: Jeff Jirsa
To: user@cassandra.apache.org
Date: Tue, 27 Feb 2018 09:47:40 -0800
Subject: Re: Version Rollback
Message-Id: <34DC363B-D7B8-4027-81A6-5F774A928AC1@gmail.com>
References: <000301d3afd6$7eae2340$7c0a69c0$@yahoo.com>

MOST minor versions support rollback - the exceptions are those where the internode protocol changes (3.0.14 being the only one in recent memory), or where the sstable format changes (again rare). No major versions support rollback - the only way to do it is to upgrade in a way that lets you effectively reinstall the old version without data loss.

The steps usually look like:

Test in a lab
Test in a lab again
Test in a lab a few more times
Snapshot everything 

If you have a passive data center:
- upgrade one instance
- check to see if it’s happy
- upgrade another
- check to see if it’s happy
- continue until the passive DC is done
- if at any point they’re unhappy, rebuild the DC from the active DC (wipe it and restream with the old version installed)
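
The passive-DC loop above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not an operational tool: `upgrade` and `is_healthy` are hypothetical callbacks standing in for whatever your own tooling does to upgrade a node and to decide that it is "happy".

```python
def rolling_upgrade(nodes, upgrade, is_healthy):
    """Upgrade nodes one at a time, stopping at the first unhealthy one.

    `upgrade` and `is_healthy` are hypothetical callbacks supplied by the
    caller; nothing here talks to a real cluster.
    Returns (healthy_upgraded_nodes, failed_node_or_None).
    """
    done = []
    for node in nodes:
        upgrade(node)
        if not is_healthy(node):
            # Stop here: the caller decides whether to rebuild the DC
            # from the active DC on the old version.
            return done, node
        done.append(node)
    return done, None
```

A `None` in the second slot means every node in the list was upgraded and passed its check.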

On the active DCs, you’ll want to canary it one replica at a time so you can treat a failed upgrade like a bad disk:
- upgrade one instance
- check if it’s happy; if it’s not, treat it like a failed disk and replace it with the old version
- if you’re using single token, do another instance in a different replica set, and repeat until you’re out of different replica sets
- if you’re using vnodes but a rack-aware snitch and have more racks than your RF, do another instance in the same rack as the canary, and repeat until you’re out of instances in that rack

This is typically your point of no return - as soon as you have two replicas on the new version, rollback is no longer practical.
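
For the vnodes case, the "stay inside the canary's rack" rule can be sketched like this. It is only an illustration and rests on an assumption: that rack-aware placement puts at most one replica of any range in each rack, so touching a second rack means touching a second replica. The node and rack names and both function names are hypothetical.

```python
def canary_plan(node_racks, canary):
    """Return the canary followed by the other nodes in the canary's rack.

    `node_racks` maps a node name to its rack name (hypothetical layout).
    """
    if canary not in node_racks:
        raise ValueError(f"unknown node: {canary}")
    rack = node_racks[canary]
    same_rack = sorted(n for n, r in node_racks.items() if r == rack and n != canary)
    return [canary] + same_rack


def past_point_of_no_return(node_racks, upgraded):
    """True once upgraded nodes span more than one rack.

    Assumption: one replica per rack, so a second rack is a second replica.
    """
    return len({node_racks[n] for n in upgraded}) > 1
```

With racks `r1` = {n1, n2} and `r2` = {n3}, the plan for canary `n1` is `n1` then `n2`; upgrading `n3` as well would cross the point of no return under the stated assumption.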


-- 
Jeff Jirsa

On Feb 27, 2018, at 9:22 AM, Carl Mueller <carl.mueller@smartthings.com> wrote:

My speculation is that IF (big if) the sstable formats are compatible between the versions, which probably isn't the case for major versions, then you could drop back.

If the sstables changed format, then you'll probably need to figure out how to rewrite the sstables in the older format and then sstableloader them into the older-version cluster if need be. Alas, while there is an sstable upgrader, there isn't a downgrader AFAIK.

And I don't have an intimate view of version-by-version sstable format changes and compatibilities. You'd probably need to check the upgrade instructions (which you presumably did if you're upgrading versions) to tell.

Basically, version rollback is pretty unlikely to be practical.

The OTHER option:
1) build a new cluster with the new version, no new data. 

2) code your driver layer to talk to both clusters. Write to both, but read preferentially from the new, then fall through to the old. Yes, that gets hairy on multi-row queries. Port your data gradually from the old cluster to the new with sstable loading.

When you've done a full load of all the data from old to new, and you're satisfied with the new cluster stability, retire the old cluster.
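
The write-to-both, read-new-then-old pattern might look roughly like this for single-key access. It is a sketch only: plain dicts stand in for the two clusters, and the class and method names are hypothetical. A real version would issue driver calls to each cluster and would have to handle one of the two writes failing.

```python
class DualClusterStore:
    """Hypothetical wrapper over an old and a new cluster.

    `old` and `new` are dict-like stand-ins for the two clusters.
    """

    def __init__(self, old, new):
        self.old = old
        self.new = new

    def write(self, key, value):
        # Write to both clusters; the order here is arbitrary.
        self.new[key] = value
        self.old[key] = value

    def read(self, key):
        # Prefer the new cluster, then fall through to the old one.
        if key in self.new:
            return self.new[key]
        return self.old.get(key)
```

Reads for keys that have not yet been ported come from the old cluster; once a key exists in the new cluster, the new copy wins.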

For merging two multi-row sets you'll probably need your multi-row queries to return the partition hash value (or extract the code that generates the hash), have two simultaneous java-driver ResultSets going, and merge their results, providing the illusion of a single database query. You'll need to pay attention to both the row key ordering and the column key ordering to ensure the combined results are properly ordered.
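
The merge described above is sketched here in Python rather than against the java-driver, to show the ordering logic only. It assumes each result stream yields `(token, clustering_key, value)` tuples already sorted by `(token, clustering_key)`; that row shape and the function name are assumptions, not something either driver provides.

```python
import heapq


def merge_results(new_rows, old_rows):
    """Merge two sorted row streams, preferring the new cluster on ties.

    Rows are (token, clustering_key, value) tuples, an assumed shape.
    Both inputs must already be sorted by (token, clustering_key).
    """
    # Tag each row with its source: 0 = new cluster, 1 = old cluster,
    # so that on equal keys the new cluster's row sorts first.
    tagged_new = ((t, ck, 0, v) for t, ck, v in new_rows)
    tagged_old = ((t, ck, 1, v) for t, ck, v in old_rows)
    last_key = None
    for t, ck, _src, v in heapq.merge(tagged_new, tagged_old, key=lambda r: r[:3]):
        if (t, ck) == last_key:
            continue  # same row already emitted from the new cluster
        last_key = (t, ck)
        yield (t, ck, v)
```

Because `heapq.merge` is lazy, both streams can be consumed page by page without loading either result set fully into memory.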

Writes will be slowed by the double-writes, and reads will be bound by the worse-performing cluster.

On Tue, Feb 27, 2018 at 8:23 AM, Kenneth Brotman <kenbrotman@yahoo.com.invalid> wrote:

Could you tell us the size and configuration of your Cassandra cluster?

Kenneth Brotman

 

From: shalom sagges [mailto:shalomsagges@gmail.com]
Sent: Tuesday, February 27, 2018 6:19 AM
To: user@cassandra.apache.org
Subject: Version Rollback

 

Hi All,

I'm planning to upgrade my C* cluster to version 3.x and was wondering what's the best way to perform a rollback if need be.

If I used snapshot restoration, I would be facing data loss, depending on when I took the snapshot (i.e. a rollback might be required after upgrading half the cluster, for example).

If I add another DC to the cluster with the old version, then I could point the apps to talk to that DC if anything bad happens, but building it is really time consuming and requires a lot of resources.

Can anyone provide recommendations on this matter? Any ideas on how to make the upgrade foolproof, or at least "really really safe"?

Thanks!

 

