Subject: Re: changing several things (almost) at once; is this the right order to make the changes?
From: Robert Coli <rcoli@eventbrite.com>
To: user@cassandra.apache.org
Date: Mon, 2 Dec 2013 16:00:04 -0800

On Mon, Dec 2, 2013 at 1:08 PM, Brian Tarbox <tarbox@cabotresearch.com> wrote:

> We're making several changes and I'd like to confirm that our order of
> making them is reasonable. Right now we have a 4-node system at
> replicationFactor=2 running 1.1.6.
>
> We're moving to a 6-node system at rf=3 running 1.2.12 (I guess).
>
> We think the order should be:
> 1) change to rf=3 and run repair on all nodes while still at 1.1.6

Yes, being aware that you will get false "no data" reads from the third
replica at CL.ONE until your repair completes.
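
For reference, the 1.1-era commands are roughly the following (a sketch;
the keyspace name "MyKS" is hypothetical, and it assumes SimpleStrategy):

    # in cassandra-cli: bump the replication factor on the keyspace
    update keyspace MyKS with strategy_options = {replication_factor:3};

    # then, on each node in turn, build out the third replica's data
    nodetool repair MyKS

Reading at CL.QUORUM during that window should also sidestep the false
empties, since at least one of the two replicas consulted will have the row.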

> 2) upgrade to 1.1.10 (latest on that branch?)

Unless NEWS.txt specifies that you need to do this, you can probably skip
it. From memory, I believe you can skip it.

> 3) upgrade to 1.2.12 (latest on that branch?)

Yes.
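
The per-node mechanics of each upgrade step are the usual rolling dance,
something like this (a sketch; confirm version specifics in NEWS.txt first):

    nodetool drain            # flush memtables and stop accepting writes
    # stop cassandra, install the new version, restart cassandra
    nodetool upgradesstables  # rewrite sstables into the new on-disk format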

> 4) run the convert-to-vnode command

If you mean shuffle, I feel bound to tell you that no one has successfully
run shuffle on an actual production cluster[1]. I conjecture that you are in
production because you are running 1.1.x.

You might be the first to successfully run shuffle in production, but you
probably do not want to try to be?
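
For completeness, and emphatically not as a recommendation: the 1.2 tree
ships the tool as cassandra-shuffle, driven roughly as follows (sub-command
names from memory):

    cassandra-shuffle create    # compute and stage the token relocations
    cassandra-shuffle enable    # begin streaming ranges to their new homes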

> 5) add two more servers

If you're going to add servers anyway, you might want to do the "new
datacenter(s)" process for upgrading to vnodes.
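
In outline, that process goes roughly like this (a sketch; the keyspace and
DC names "MyKS", "DC1", "DC2" are hypothetical, and it assumes a DC-aware
snitch such as GossipingPropertyFileSnitch):

    # cassandra.yaml on each of the six new nodes: vnodes on, no manual token
    num_tokens: 256
    # initial_token: <leave unset>

    # in cqlsh: replicate to both DCs
    ALTER KEYSPACE MyKS WITH replication =
      {'class': 'NetworkTopologyStrategy', 'DC1': 2, 'DC2': 3};

    # on each new node: stream its data over from the old DC
    nodetool rebuild DC1

Once the new DC is caught up and clients have been repointed at it, drop
'DC1' from the replication map and decommission the old nodes.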

=Rob

[1] rbranson apparently did a shuffle-like activity successfully, but by
adding two additional DCs, one with a node with enough disk space to hold
the entire cluster's data...