Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
MIME-Version: 1.0
Date: Sun, 21 Jun 2015 23:25:48 -0400
Message-ID: 
 <CACCLA56tjYSPqS6ndQZO7dE4TbqsVS2sMERzPCcCgRB0wHPo6g@mail.gmail.com>
Subject: Create a smaller cluster based on snapshots
From: John Wong <gokoproject@gmail.com>
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=001a113fd05a76e871051912d468

--001a113fd05a76e871051912d468
Content-Type: text/plain; charset=UTF-8

Hi.

Suppose I have a 6-node cluster running and I want to build a 3-node
cluster based on it. What is the recommended way to build such a cluster
quickly? Each node holds about 120 GB and we have RF=3. We are on
Cassandra 1.2.19 and we are not using vnodes.

My initial research shows it can be done either with sstableloader or by
restoring from snapshots and fixing the token ranges.

In the case of sstableloader, given that it streams data and we would be
restoring from a live server, this seems to be a slow process if we
throttle the traffic. Even if I take this route, do I just pick any 3 of
the 6 nodes, in any order?

In the case of restoring from snapshots, I have restored a 6-node replica
before by just copying the snapshot files (along with the schema files) and
running nodetool refresh, which should complete in a few hours. But now,
with a smaller replica, do I again just pick snapshots from any 3 nodes?
What do I need to fix in the token ranges, and why (from what I read)?
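
To make the token question concrete, here is a minimal sketch of how evenly
spaced initial_token values could be computed for a 6-node ring and a 3-node
ring. It rests on assumptions not stated above: that the cluster uses
Murmur3Partitioner (a cluster on RandomPartitioner would use the range 0 to
2**127 instead) and that tokens are evenly spaced. The function name is
made up for illustration.

```python
# Sketch only: assumes Murmur3Partitioner and evenly spaced tokens.
# murmur3_tokens is a hypothetical helper, not part of any Cassandra tool.

def murmur3_tokens(node_count):
    """Return evenly spaced tokens over the range [-2**63, 2**63 - 1]."""
    step = 2**64 // node_count
    return [-2**63 + i * step for i in range(node_count)]

six = murmur3_tokens(6)    # candidate tokens for the existing 6-node ring
three = murmur3_tokens(3)  # candidate tokens for the new 3-node ring

for name, tokens in (("6-node", six), ("3-node", three)):
    print(name)
    for i, token in enumerate(tokens):
        print(f"  node {i}: initial_token = {token}")
```

Comparing the two lists shows how the ranges each node owns would differ
between the two ring sizes, which I assume is why the tokens need fixing.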

Any feedback is appreciated.

Thanks.

John

--001a113fd05a76e871051912d468--