Subject: Re: changing several things (almost) at once; is this the right order to make the changes?
From: Robert Coli <rcoli@eventbrite.com>
To: user@cassandra.apache.org
Date: Mon, 2 Dec 2013 16:00:04 -0800

On Mon, Dec 2, 2013 at 1:08 PM, Brian Tarbox <tarbox@cabotresearch.com> wrote:

> We're making several changes and I'd like to confirm that our order of
> making them is reasonable. Right now we have a 4-node system at
> replicationFactor=2 running 1.1.6.
>
> We're moving to a 6-node system at rf=3 running 1.2.12 (I guess).
>
> We think the order should be:
> 1) change to rf=3 and run repair on all nodes while still at 1.1.6

Yes, being aware that you will get false "no data" reads from the third
replica at CL.ONE until your repair completes.
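
For reference, the 1.1-era commands are roughly the following (a sketch;
the keyspace name "MyKS" is hypothetical, and it assumes SimpleStrategy):

    # in cassandra-cli: bump the replication factor on the keyspace
    update keyspace MyKS with strategy_options = {replication_factor:3};

    # then, on each node in turn, build out the third replica's data
    nodetool repair MyKS

Reading at CL.QUORUM during that window should also sidestep the false
empties, since at least one of the two replicas consulted will have the row.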

> 2) upgrade to 1.1.10 (latest on that branch?)

Unless NEWS.txt specifies that you need to do this, you can probably skip
it. From memory, I believe you can skip it.

> 3) upgrade to 1.2.12 (latest on that branch?)

Yes.
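
The per-node mechanics of each upgrade step are the usual rolling dance,
something like this (a sketch; confirm version specifics in NEWS.txt first):

    nodetool drain            # flush memtables and stop accepting writes
    # stop cassandra, install the new version, restart cassandra
    nodetool upgradesstables  # rewrite sstables into the new on-disk format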

> 4) run the convert-to-vnode command

If you mean shuffle, I feel bound to tell you that no one has successfully
run shuffle on an actual production cluster[1]. I conjecture that you are in
production because you are running 1.1.x.

You might be the first to successfully run shuffle in production, but you
probably do not want to try to be?
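
For completeness, and emphatically not as a recommendation: the 1.2 tree
ships the tool as cassandra-shuffle, driven roughly as follows (sub-command
names from memory):

    cassandra-shuffle create    # compute and stage the token relocations
    cassandra-shuffle enable    # begin streaming ranges to their new homes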

> 5) add two more servers

If you're going to add servers anyway, you might want to do the "new
datacenter(s)" process for upgrading to vnodes.
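
In outline, that process goes roughly like this (a sketch; the keyspace and
DC names "MyKS", "DC1", "DC2" are hypothetical, and it assumes a DC-aware
snitch such as GossipingPropertyFileSnitch):

    # cassandra.yaml on each of the six new nodes: vnodes on, no manual token
    num_tokens: 256
    # initial_token: <leave unset>

    # in cqlsh: replicate to both DCs
    ALTER KEYSPACE MyKS WITH replication =
      {'class': 'NetworkTopologyStrategy', 'DC1': 2, 'DC2': 3};

    # on each new node: stream its data over from the old DC
    nodetool rebuild DC1

Once the new DC is caught up and clients have been repointed at it, drop
'DC1' from the replication map and decommission the old nodes.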

=Rob

[1] rbranson apparently did a shuffle-like activity successfully, but by
adding two additional DCs, one with a node with enough disk space to hold
the entire cluster's data...