From: Gerard Maas
To: user@cassandra.apache.org
Date: Wed, 12 Aug 2015 11:11:19 +0200
Subject: Re: Duplicating a cluster with different # of disks

Many thanks for confirming the procedure. I was doing the copy from 3 -> 2
as explained before. My doubt came from noticing that the total count
differed strongly between source and destination: 3M vs 150k rows. But
small test tables with a few hundred records all went well.

I double-checked the copy and the procedure was correct. It was a table we
had issues with in the past (a few very long rows). Maybe related to that?

Kr, Gerard

On Aug 6, 2015 11:00 PM, "Alain RODRIGUEZ" <arodrime@gmail.com> wrote:

> I agree with Jeff, those two solutions should work well to give you a
> distinct cluster (the data will be fixed in time, not synchronised).
>
> It really depends on you, but basically having hybrid data storage
> structures is not an issue at all in both cases, as it is something you
> can set in cassandra.yaml at the node level.
>
> C*heers,
>
> Alain
>
> 2015-08-06 22:41 GMT+02:00 Jeff Jirsa <Jeff.Jirsa@crowdstrike.com>:
>
>> You can copy all of the sstables into any given data directory without
>> issue (keep them within the keyspace/table directories, but the
>> mnt/mnt2/mnt3 location is irrelevant).
>>
>> You can also stream them in via sstableloader if your ring topology has
>> changed (especially if tokens have moved).
>>
>> From: Gerard Maas
>> Reply-To: "user@cassandra.apache.org"
>> Date: Thursday, August 6, 2015 at 9:50 AM
>> To: "user@cassandra.apache.org"
>> Subject: Duplicating a cluster with different # of disks
>>
>> Hi,
>>
>> I'm currently trying to duplicate a given keyspace on a new cluster to
>> run some analytics on it.
>>
>> My source cluster has 3 disks and corresponding data directories (mnt,
>> mnt2, mnt3), but the machines in my target cluster only have 2 disks
>> (mnt, mnt2).
>>
>> What would be the correct procedure to copy the sstable data from
>> source to destination in this case?
>>
>> -kr, Gerard.
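The 3-disk-to-2-disk copy Jeff describes can be sketched locally. Everything below is illustrative, not taken from the thread: keyspace `ks1`, table `t1`, the `mnt*` directory names, and the 2.1-era sstable file names are stand-ins; on a real node the directories come from `data_file_directories` in cassandra.yaml. Note the three files here carry distinct generation numbers (1, 2, 3); sstables from different source disks with colliding generations would need renaming before being merged into one directory.

```shell
# Local sketch (assumed names): merge sstables from a 3-directory layout
# into one directory of a 2-directory layout. Only the keyspace/table
# subdirectory structure matters; which mnt* a file lands on does not.
set -e
work=$(mktemp -d)

# Three "source disks", each holding part of ks1.t1.
mkdir -p "$work/src/mnt/ks1/t1" "$work/src/mnt2/ks1/t1" "$work/src/mnt3/ks1/t1"
touch "$work/src/mnt/ks1/t1/ks1-t1-ka-1-Data.db"
touch "$work/src/mnt2/ks1/t1/ks1-t1-ka-2-Data.db"
touch "$work/src/mnt3/ks1/t1/ks1-t1-ka-3-Data.db"

# Two "target disks". All source sstables can go into a single one of
# them, as long as the ks1/t1 directory layout is preserved.
mkdir -p "$work/dst/mnt/ks1/t1" "$work/dst/mnt2/ks1/t1"
for d in mnt mnt2 mnt3; do
  cp "$work/src/$d/ks1/t1/"*-Data.db "$work/dst/mnt/ks1/t1/"
done

ls "$work/dst/mnt/ks1/t1"
# On the real target node, finish with:  nodetool refresh ks1 t1
```

If the target ring's topology differs from the source's (tokens moved, different node count), `sstableloader` is the safer route, since it streams each row to whichever nodes now own it instead of assuming the file placement is still valid.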