From: Paulo Motta
Date: Wed, 2 Oct 2013 16:01:48 -0300
Subject: Re: Best version to upgrade from 1.1.10 to 1.2.X
To: "user@cassandra.apache.org"
Reply-To: user@cassandra.apache.org
Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=e89a8ff1c372c13f0a04e7c6b4d5

--e89a8ff1c372c13f0a04e7c6b4d5
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

Never mind the question. It was a firewall problem. Now the nodes running
different versions are able to see each other! =)

Cheers,

Paulo


2013/10/2 Paulo Motta

> Hello,
>
> I just started the rolling upgrade procedure from 1.1.10 to 1.2.10. Our
> strategy is to simultaneously upgrade one server from each replication
> group. So, if we have 6 nodes with RF=2, we will upgrade 3 nodes at a
> time (from distinct replication groups).
>
> My question is: do the newly upgraded nodes show as "Down" in the
> "nodetool ring" of the old cluster (1.1.10)? Because I thought that
> network compatibility meant nodes from a newer version would receive
> traffic (writes + reads) from the previous version without problems.
>
> Cheers,
>
> Paulo
>
>
> 2013/9/26 Paulo Motta
>
>> Hello Charles,
>>
>> Thank you very much for your detailed upgrade report. It'll be very
>> helpful during our upgrade operation (even though we'll do a rolling
>> production upgrade).
>>
>> I'll also share our findings during the upgrade here.
>>
>> Cheers,
>>
>> Paulo
>>
>>
>> 2013/9/24 Charles Brophy
>>
>>> Hi Paulo,
>>>
>>> I just completed a migration from 1.1.10 to 1.2.10 and it was
>>> surprisingly painless.
>>>
>>> The course of action that I took:
>>> 1) describe cluster - make sure all nodes are on the same schema
>>> 2) shut off all maintenance tasks; i.e. make sure no scheduled repair
>>> is going to kick off in the middle of what you're doing
>>> 3) snapshot - maybe not necessary but it's so quick it makes no sense
>>> to skip this step
>>> 4) drain the nodes - I shut down the entire cluster rather than chance
>>> any incompatible gossip concerns that might come from a rolling
>>> upgrade. I have the luxury of controlling both the providers and
>>> consumers of our data, so this wasn't so disruptive for us.
>>> 5) Upgrade the nodes, turn them on one-by-one, monitor the logs for
>>> funny business.
>>> 6) nodetool upgradesstables
>>> 7) Turn various maintenance tasks back on, etc.
>>>
>>> The worst part was managing the yaml/config changes between the
>>> versions. It wasn't horrible, but the diff was "noisier" than a more
>>> incremental upgrade typically is. A few things I recall that were
>>> special:
>>> 1) Since you have an existing cluster, you'll probably need to set the
>>> default partitioner back to RandomPartitioner in cassandra.yaml. I
>>> believe that is outlined in NEWS.
>>> 2) I set the initial tokens to be the same as what the nodes held
>>> previously.
>>> 3) The timeout is now divided into more atomic settings and you get to
>>> decide how (or if) to configure it from the default appropriately.
>>>
>>> tl;dr: I did a standard upgrade and paid careful attention to the
>>> NEWS.txt upgrade notices. I did a full cluster restart and NOT a
>>> rolling upgrade. It went without a hitch.
>>>
>>> Charles
>>>
>>>
>>> On Tue, Sep 24, 2013 at 2:33 PM, Paulo Motta wrote:
>>>
>>>> Cool, sounds fair enough. Thanks for the help, Rob!
>>>>
>>>> If anyone has upgraded from 1.1.X to 1.2.X, please feel invited to
>>>> share any tips on issues you've encountered that are not yet
>>>> documented.
>>>>
>>>> Cheers,
>>>>
>>>> Paulo
>>>>
>>>>
>>>> 2013/9/24 Robert Coli
>>>>
>>>>> On Tue, Sep 24, 2013 at 1:41 PM, Paulo Motta wrote:
>>>>>
>>>>>> Doesn't the probability of something going wrong increase as the
>>>>>> gap between the versions increases? So, using this reasoning,
>>>>>> upgrading from 1.1.10 to 1.2.6 would have less chance of something
>>>>>> going wrong than from 1.1.10 to 1.2.9 or 1.2.10.
>>>>>>
>>>>>
>>>>> Sorta, but sorta not.
>>>>>
>>>>> https://github.com/apache/cassandra/blob/trunk/NEWS.txt
>>>>>
>>>>> Is the canonical source of concerns on upgrade. There are a few
>>>>> cases where upgrading to the "root" of X.Y.Z creates issues that do
>>>>> not exist if you upgrade to the "head" of that line. AFAIK there
>>>>> have been no cases where upgrading to the "head" of a line (where
>>>>> that line is mature, like 1.2.10) has created problems which would
>>>>> have been avoided by upgrading to the "root" first.
>>>>>
>>>>>
>>>>>> I'm hoping this reasoning is wrong and I can update directly from
>>>>>> 1.1.10 to 1.2.10. :-)
>>>>>>
>>>>>
>>>>> That's what I plan to do when we move to 1.2.X, FWIW.
>>>>>
>>>>> =Rob
>>>>>
>>>>
>>>>
>>>> --
>>>> Paulo Ricardo
>>>>
>>>> --
>>>> European Master in Distributed Computing
>>>> Royal Institute of Technology - KTH
>>>> Instituto Superior Técnico - IST
>>>> http://paulormg.com
>>>>
>>>
>>
>

-- 
Paulo Ricardo

-- 
European Master in Distributed Computing
Royal Institute of Technology - KTH
Instituto Superior Técnico - IST
http://paulormg.com

--e89a8ff1c372c13f0a04e7c6b4d5
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Never mind the question. It was a firewall problem. Now the nodes running different versions are able to see each other! =)

Cheers,

Paulo
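
For anyone hitting the same symptom, a quick reachability check between nodes can rule the firewall in or out before suspecting version incompatibility. The sketch below is only an illustration: the thread does not say which port was blocked, so it assumes the default inter-node `storage_port` of 7000 from cassandra.yaml, and the peer addresses passed in are hypothetical.

```python
import socket

# Assumption: 7000 is the default inter-node storage_port in cassandra.yaml.
# Adjust if your cluster overrides it (or uses the SSL storage port).
DEFAULT_STORAGE_PORT = 7000


def can_reach(host, port=DEFAULT_STORAGE_PORT, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False


def unreachable_peers(peers, port=DEFAULT_STORAGE_PORT, timeout=2.0):
    """Return the peers (hypothetical host names or IPs) that cannot be reached."""
    return [peer for peer in peers if not can_reach(peer, port, timeout)]
```

Running `unreachable_peers([...])` from an old-version node against the upgraded nodes (and the other way round) shows whether a node marked "Down" is simply unreachable at the TCP level.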


2013/10/2 Paulo Motta <pauloricardomg@gmail.com>
Hello,

I just started the rolling upgrade procedure from 1.1.10 to 1.2.10. Our strategy is to simultaneously upgrade one server from each replication group. So, if we have 6 nodes with RF=2, we will upgrade 3 nodes at a time (from distinct replication groups).
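
The batching described here can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure actually used: it assumes the replicas of each range sit on RF consecutive ring positions (SimpleStrategy-like placement) and that the node count is a multiple of RF. The function name and node names are hypothetical.

```python
def upgrade_batches(ring, rf):
    """Split nodes, listed in ring order, into rf upgrade batches.

    Assumes each range is replicated on rf consecutive ring positions and
    that len(ring) is a multiple of rf, so every replica set has exactly
    one node in each batch.
    """
    if rf < 1 or len(ring) % rf != 0:
        raise ValueError("ring size must be a positive multiple of rf")
    # Taking every rf-th node starting at each offset keeps ring neighbours
    # (which share replicas) in different batches.
    return [ring[offset::rf] for offset in range(rf)]


if __name__ == "__main__":
    for batch in upgrade_batches(["n1", "n2", "n3", "n4", "n5", "n6"], 2):
        print(batch)
```

With 6 nodes and RF=2 this gives two batches of 3 nodes. While a batch is down, each range has only one live replica, so requests that need both replicas (QUORUM with RF=2) would not succeed during that window.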

My question is: do the newly upgraded nodes show as "Down" in the "nodetool ring" of the old cluster (1.1.10)? Because I thought that network compatibility meant nodes from a newer version would receive traffic (writes + reads) from the previous version without problems.

Cheers,

Paulo


2013/9/26 Paulo Motta <pauloricardomg@gmail.com>
Hello Charles,

Thank you very much for your detailed upgrade report. It'll be very helpful during our upgrade operation (even though we'll do a rolling production upgrade).

I'll also share our findings during the upgrade here.

Cheers,

Paulo


2013/9/24 Charles Brophy <cbrophy@zulily.com>
Hi Paulo,

I just completed a migration from 1.1.10 to 1.2.10 and it was surprisingly painless.

The course of action that I took:
1) describe cluster - make sure all nodes are on the same schema
2) shut off all maintenance tasks; i.e. make sure no scheduled repair is going to kick off in the middle of what you're doing
3) snapshot - maybe not necessary but it's so quick it makes no sense to skip this step
4) drain the nodes - I shut down the entire cluster rather than chance any incompatible gossip concerns that might come from a rolling upgrade. I have the luxury of controlling both the providers and consumers of our data, so this wasn't so disruptive for us.
5) Upgrade the nodes, turn them on one-by-one, monitor the logs for funny business.
6) nodetool upgradesstables
7) Turn various maintenance tasks back on, etc.
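
The nodetool portion of the steps above (3, 4 and 6) could be scripted roughly as follows. This is a dry-run sketch rather than the commands actually run: the host names and the snapshot tag are hypothetical, and the schema check, the scheduler changes and the package upgrade itself are left as manual steps.

```python
def upgrade_commands(host, snapshot_tag="pre-upgrade"):
    """Return (before, after) lists of nodetool invocations for one node.

    `before` runs prior to stopping Cassandra and swapping the binaries;
    `after` runs once the upgraded node is back up.
    """
    nodetool = ["nodetool", "-h", host]
    before = [
        nodetool + ["snapshot", "-t", snapshot_tag],  # step 3
        nodetool + ["drain"],                         # step 4
    ]
    after = [
        nodetool + ["upgradesstables"],               # step 6
    ]
    return before, after


if __name__ == "__main__":
    # Hypothetical host names; this only prints the commands.
    for host in ["cass-1", "cass-2"]:
        before, after = upgrade_commands(host)
        for cmd in before + after:
            print(" ".join(cmd))
```

Printing instead of executing keeps the sketch safe to run anywhere; passing each list to `subprocess.run(cmd, check=True)` would be one way to execute it for real.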

The worst part was managing the yaml/config changes between the versions. It wasn't horrible, but the diff was "noisier" than a more incremental upgrade typically is. A few things I recall that were special:
1) Since you have an existing cluster, you'll probably need to set the default partitioner back to RandomPartitioner in cassandra.yaml. I believe that is outlined in NEWS.
2) I set the initial tokens to be the same as what the nodes held previously.
3) The timeout is now divided into more atomic settings and you get to decide how (or if) to configure it from the default appropriately.
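
The three config points above can be checked mechanically before starting an upgraded node. The sketch below is an assumption-laden illustration: the timeout key names are the ones I recall from the 1.2 default cassandra.yaml and should be verified against NEWS.txt, the helper names are hypothetical, and the parser only reads flat top-level `key: value` lines rather than full YAML.

```python
EXPECTED_PARTITIONER = "org.apache.cassandra.dht.RandomPartitioner"

# Assumption: names of the split timeout settings in the 1.2 cassandra.yaml.
TIMEOUT_KEYS = [
    "read_request_timeout_in_ms",
    "range_request_timeout_in_ms",
    "write_request_timeout_in_ms",
    "truncate_request_timeout_in_ms",
    "request_timeout_in_ms",
]


def top_level_settings(yaml_text):
    """Collect flat top-level 'key: value' pairs; not a full YAML parser."""
    settings = {}
    for line in yaml_text.splitlines():
        # Skip blanks, comments, list items and indented (nested) lines.
        if not line or line[0] in " \t#-" or ":" not in line:
            continue
        key, _, value = line.partition(":")
        settings[key.strip()] = value.split(" #")[0].strip()
    return settings


def review(yaml_text, old_token):
    """Return human-readable notes about the three points listed above."""
    settings = top_level_settings(yaml_text)
    notes = []
    if settings.get("partitioner") != EXPECTED_PARTITIONER:
        notes.append("partitioner is not RandomPartitioner")
    if settings.get("initial_token") != str(old_token):
        notes.append("initial_token differs from the token held before the upgrade")
    for key in TIMEOUT_KEYS:
        if key not in settings:
            notes.append(key + " not set explicitly")
    return notes
```

A missing timeout key is only a note, not an error: the default may well be what you want, as point 3 says.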

tl;dr: I did a standard upgrade and paid careful attention to the NEWS.txt upgrade notices. I did a full cluster restart and NOT a rolling upgrade. It went without a hitch.

Charles





On Tue, Sep 24, 2013 at 2:33 PM, Paulo Motta <pauloricardomg@gmail.com> wrote:
Cool, sounds fair enough. Thanks for the help, Rob!

If anyone has upgraded from 1.1.X to 1.2.X, please feel invited to share any tips on issues you've encountered that are not yet documented.

Cheers,

Paulo


2013/9/24 Robert Coli <rcoli@eventbrite.com>
On Tue, Sep 24, 2013 at 1:41 PM, Paulo Motta <pauloricardomg@gmail.com> wrote:
Doesn't the probability of something going wrong increase as the gap between the versions increases? So, using this reasoning, upgrading from 1.1.10 to 1.2.6 would have less chance of something going wrong than from 1.1.10 to 1.2.9 or 1.2.10.

Sorta, but sorta not.

https://github.com/apache/cassandra/blob/trunk/NEWS.txt

Is the canonical source of concerns on upgrade. There are a few cases where upgrading to the "root" of X.Y.Z creates issues that do not exist if you upgrade to the "head" of that line. AFAIK there have been no cases where upgrading to the "head" of a line (where that line is mature, like 1.2.10) has created problems which would have been avoided by upgrading to the "root" first.
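
To make the "head of a line" idea concrete, here is a small sketch that picks the highest patch release within a given X.Y line. The function names are hypothetical, and the version list in the usage line is sample input built from versions mentioned in this thread, not an authoritative release list.

```python
def parse(version):
    """Turn '1.2.10' into (1, 2, 10) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def head_of_line(versions, line):
    """Return the highest patch release whose major.minor equals `line`."""
    prefix = parse(line)
    candidates = [v for v in versions if parse(v)[:2] == prefix]
    return max(candidates, key=parse) if candidates else None


if __name__ == "__main__":
    print(head_of_line(["1.1.10", "1.2.0", "1.2.6", "1.2.9", "1.2.10"], "1.2"))
```

Comparing parsed tuples matters here: a plain string comparison would rank "1.2.9" above "1.2.10".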

I'm hoping this reasoning is wrong and I can update directly from 1.1.10 to 1.2.10. :-)

That's what I plan to do when we move to 1.2.X, FWIW.

=Rob



--
Paulo Ricardo

--
European Master in Distributed Computing
Royal Institute of Technology - KTH
Instituto Superior Técnico - IST
http://paulormg.com




--
Paulo Ricardo

--
European Master in Distributed Computing
Royal Institute of Technology - KTH
Instituto Superior Técnico - IST
http://paulormg.com
--e89a8ff1c372c13f0a04e7c6b4d5--