From: Paulo Motta <pauloricardomg@gmail.com>
Date: Wed, 2 Oct 2013 15:49:07 -0300
Subject: Re: Best version to upgrade from 1.1.10 to 1.2.X
To: user@cassandra.apache.org

Hello,

I just started the rolling upgrade procedure from 1.1.10 to 1.2.10. Our
strategy is to simultaneously upgrade one server from each replication
group. So, if we have 6 nodes with RF=2, we will upgrade 3 nodes at a
time (from distinct replication groups).

My question is: do the newly upgraded nodes show as "Down" in the
"nodetool ring" output of the old (1.1.10) nodes? I thought network
compatibility meant that nodes on a newer version could receive traffic
(writes and reads) from nodes on the previous version without problems.

Cheers,

Paulo

2013/9/26 Paulo Motta <pauloricardomg@gmail.com>
> Hello Charles,
>
> Thank you very much for your detailed upgrade report. It'll be very
> helpful during our upgrade operation (even though we'll do a rolling
> production upgrade).
>
> I'll also share our findings during the upgrade here.
>
> Cheers,
>
> Paulo
>
>
> 2013/9/24 Charles Brophy <cbrophy@zulily.com>
>> Hi Paulo,
>>
>> I just completed a migration from 1.1.10 to 1.2.10 and it was
>> surprisingly painless.
>>
>> The course of action that I took:
>> 1) describe cluster - make sure all nodes are on the same schema
>> 2) shut off all maintenance tasks; i.e. make sure no scheduled repair
>> is going to kick off in the middle of what you're doing
>> 3) snapshot - maybe not necessary, but it's so quick it makes no sense
>> to skip this step
>> 4) drain the nodes - I shut down the entire cluster rather than chance
>> any incompatible gossip concerns that might come from a rolling
>> upgrade. I have the luxury of controlling both the providers and
>> consumers of our data, so this wasn't so disruptive for us.
>> 5) Upgrade the nodes, turn them on one by one, and monitor the logs
>> for funny business.
>> 6) nodetool upgradesstables
>> 7) Turn various maintenance tasks back on, etc.
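
(Inline sketch, not from Charles: here is roughly how I read steps 1-7 as
per-node commands. Untested; the service name, log path, and snapshot tag
are placeholders for whatever your install uses, and the ring check at the
end is exactly the experiment I ask about above.)

    # 1) confirm all nodes agree on the schema version
    #    ("describe cluster;" in cassandra-cli on 1.1)
    echo 'describe cluster;' | cassandra-cli -h localhost
    # 3) cheap safety net: snapshots are hard links, so this is fast
    nodetool snapshot -t pre-1.2.10-upgrade
    # 4) flush memtables and stop serving traffic, then stop the node
    nodetool drain
    sudo service cassandra stop
    # ... install 1.2.10 and merge the cassandra.yaml changes ...
    sudo service cassandra start
    # 5) watch for errors or gossip/schema complaints on startup
    tail -f /var/log/cassandra/system.log
    # 6) rewrite the data files in the new on-disk format
    nodetool upgradesstables
    # and, from a node still on 1.1.10, check how the upgraded node
    # gossips: it should show as "Up"
    nodetool -h <old-node-address> ring
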
>>
>> The worst part was managing the yaml/config changes between the
>> versions. It wasn't horrible, but the diff was "noisier" than a more
>> incremental upgrade typically is. A few things I recall that were
>> special:
>> 1) Since you have an existing cluster, you'll probably need to set the
>> default partitioner back to RandomPartitioner in cassandra.yaml. I
>> believe that is outlined in NEWS.
>> 2) I set the initial tokens to be the same as what the nodes held
>> previously.
>> 3) The timeout is now divided into more atomic settings, and you get
>> to decide how (or if) to configure each of them relative to the
>> default.
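
(Also inline: a quick, hypothetical sanity check for those three settings
before starting a node on 1.2. The key names are from the 1.2
cassandra.yaml; the file path is a guess for a package install.)

    # partitioner: 1.2 defaults new clusters to Murmur3Partitioner; an
    #   existing 1.1 cluster must stay on RandomPartitioner
    # initial_token: should match the token this node owned on 1.1
    # *_request_timeout_in_ms: 1.2 splits the old rpc_timeout_in_ms into
    #   per-operation timeouts (read/write/range/truncate + a general one)
    grep -E '^(partitioner|initial_token|[a-z_]*request_timeout_in_ms)' \
        /etc/cassandra/cassandra.yaml
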
>>
>> tldr; I did a standard upgrade and paid careful attention to the
>> NEWS.txt upgrade notices. I did a full cluster restart and NOT a
>> rolling upgrade. It went without a hitch.
>>
>> Charles
>>
>>
>> On Tue, Sep 24, 2013 at 2:33 PM, Paulo Motta
>> <pauloricardomg@gmail.com> wrote:
>>> Cool, sounds fair enough. Thanks for the help, Rob!
>>>
>>> If anyone has upgraded from 1.1.X to 1.2.X, please feel invited to
>>> share any tips on issues you've encountered that are not yet
>>> documented.
>>>
>>> Cheers,
>>>
>>> Paulo
>>>
>>>
>>> 2013/9/24 Robert Coli <rcoli@eventbrite.com>
>>>> On Tue, Sep 24, 2013 at 1:41 PM, Paulo Motta
>>>> <pauloricardomg@gmail.com> wrote:
>>>>> Doesn't the probability of something going wrong increase as the
>>>>> gap between the versions increases? By that reasoning, upgrading
>>>>> from 1.1.10 to 1.2.6 would have less chance of something going
>>>>> wrong than from 1.1.10 to 1.2.9 or 1.2.10.
>>>>
>>>> Sorta, but sorta not.
>>>>
>>>> https://github.com/apache/cassandra/blob/trunk/NEWS.txt
>>>>
>>>> is the canonical source of concerns on upgrade. There are a few
>>>> cases where upgrading to the "root" of X.Y.Z creates issues that do
>>>> not exist if you upgrade to the "head" of that line. AFAIK there
>>>> have been no cases where upgrading to the "head" of a line (where
>>>> that line is mature, like 1.2.10) has created problems which would
>>>> have been avoided by upgrading to the "root" first.
>>>>
>>>>> I'm hoping this reasoning is wrong and I can update directly from
>>>>> 1.1.10 to 1.2.10. :-)
>>>>
>>>> That's what I plan to do when we move to 1.2.X, FWIW.
>>>>
>>>> =Rob

-- 
Paulo Ricardo

-- 
European Master in Distributed Computing
Royal Institute of Technology - KTH
Instituto Superior Técnico - IST
http://paulormg.com