Message-ID: <4F54EB20.2080804@koblas.com>
Date: Mon, 05 Mar 2012 08:34:40 -0800
From: David Koblas
To: user@cassandra.apache.org
CC: Jeremiah Jordan
Subject: Re: Adding a second datacenter
References: <4F54DEF2.1080207@koblas.com> <4F54E3A3.4060802@morningstar.com>
In-Reply-To: <4F54E3A3.4060802@morningstar.com>

Jeremiah,

Thanks! I'm running 1.0.8. Two interesting things to note:

- I don't have sufficient disk space to handle a straight bump to a
  replication factor of 4, so I think I'm going to have to do it one
  step at a time (1, 2, 3 and 4) with a bunch of cleanups in between.

- Also, using LOCAL_QUORUM doesn't work for me, because my application
  has a hard response time limit and my read speed ends up being the
  speed of the slowest node. What I want is LOCAL_ONE, which doesn't
  exist in the API (unless I missed something).

Yes, CASSANDRA-3483 is really what I'm looking for.

--david

On 3/5/12 8:02 AM, Jeremiah Jordan wrote:
> You need to make sure your clients are reading using LOCAL_* settings
> so that they don't try to get data from the other data center. But
> you shouldn't get errors while replication_factor is 0. Once you
> change the replication factor to 4, you should get missing data if you
> are using LOCAL_* for reading.
>
> What version are you using?
>
> See the IRC logs at the beginning of this JIRA discussion thread for
> some info:
>
> https://issues.apache.org/jira/browse/CASSANDRA-3483
>
> But you should be able to:
> 1. Set dc2:0 in the replication_factor.
> 2. Set bootstrap to false on the new nodes.
> 3. Start all of the new nodes.
> 4. Change replication_factor to dc2:4.
> 5. Run repair on the nodes in dc2.
>
> Once the repairs finish you should be able to start using DC2. You
> are still going to need a bunch of extra space because the repair is
> going to get you a couple copies of the data.
>
> Once 1.1 comes out it will have new nodetool commands for making this
> a little nicer per CASSANDRA-3483
>
> -Jeremiah
>
>
> On 03/05/2012 09:42 AM, David Koblas wrote:
>> Everything that I've read about data centers focuses on setting
>> things up at the beginning of time.
>>
>> I have the following situation:
>>
>> 10 machines in a datacenter (DC1), with replication factor of 2.
>>
>> I want to set up a second data center (DC2) with the following
>> configuration:
>> 20 machines with a replication factor of 4
>>
>> What I've found is that if I initially start adding things, the first
>> machine to join the network attempts to replicate all of the data
>> from DC1 and fills up its disk drive. I've played with setting the
>> strategy_options to have a replication factor of 0, then I can bring
>> up all 20 machines in DC2 but then start getting a huge number of
>> read errors from reads on DC1.
>>
>> Is there a simple cookbook on how to add a second DC? I'm currently
>> trying to set the replication factor to 1 and do a repair, but that
>> doesn't feel like the right approach.
>>
>> Thanks,
>>
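
A rough sketch of the disk arithmetic behind raising the DC2 replication factor one step at a time (1, 2, 3, 4), as discussed in the reply above. It assumes data is spread evenly across the nodes in the data center, so the steady-state load per node is roughly unique data times replication factor divided by node count. The function names and the sample figures (1000 GB of unique data, 250 GB disks) are hypothetical and not taken from the thread. The temporary extra space that repair needs, which Jeremiah mentions, is not modeled here.

```python
def per_node_load_gb(unique_data_gb, replication_factor, node_count):
    """Approximate steady-state data per node in one data center.

    Assumes an even token distribution (an assumption, not from the thread).
    """
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    if replication_factor < 0:
        raise ValueError("replication_factor must not be negative")
    return unique_data_gb * replication_factor / node_count


def stepwise_plan(unique_data_gb, target_rf, node_count, disk_gb):
    """List (rf, load_gb, fits) for each replication factor from 1 to target_rf."""
    plan = []
    for rf in range(1, target_rf + 1):
        load = per_node_load_gb(unique_data_gb, rf, node_count)
        # 'fits' only compares steady-state load with disk size; repair and
        # compaction overhead would need additional headroom on top of this.
        plan.append((rf, load, load <= disk_gb))
    return plan


if __name__ == "__main__":
    # Hypothetical figures: 1000 GB unique data, 20 nodes in DC2, 250 GB disks.
    for rf, load, fits in stepwise_plan(1000.0, 4, 20, 250.0):
        print(f"dc2:{rf} -> about {load:.0f} GB per node, fits on disk: {fits}")
```

Running the same calculation with real data sizes before each replication factor change gives a quick check of how much headroom is left for the repair that follows.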