From: Ran Tavory
To: user@cassandra.apache.org
Date: Thu, 15 Apr 2010 18:16:26 +0300
Subject: RackAware and replication strategy

I'm reading this on this page: http://wiki.apache.org/cassandra/ArchitectureInternals

> AbstractReplicationStrategy controls what nodes get secondary, tertiary,
> etc. replicas of each key range. Primary replica is always determined by the
> token ring (in TokenMetadata) but you can do a lot of variation with the
> others. RackUnaware just puts replicas on the next N-1 nodes in the ring.
> RackAware puts the first non-primary replica in the next node in the ring in
> ANOTHER data center than the primary; then the remaining replicas in the
> same as the primary.

So I just want to make sure I got this right and that the documentation is up to date.
I have two data centers and rack-aware.

When replication factor is 2: is it always the case that the primary replica goes to one DC and the second replica to the second DC?
When replication factor is 3: first replica in DC1, second in DC2 and third in DC1.
When replication factor is 4: first replica in DC1, second in DC2, third in DC1, fourth in DC1, etc.

If I have 4 hosts in each DC, which replication factors make sense?
N=1 - When I don't care about losing data, cool.
N=2 - When I want to make sure each DC has a copy; useful for fast local access, and allows recovery if only one host is down.
N=3 - If I want to make sure each DC has a copy, plus recovery can be made faster in certain cases, and it is more resilient to two hosts being down.
N=4 - Like N=3 but even more resilient, etc.

Say I want to have two replicas in each DC, can this be done?
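
To make the question concrete, here is a minimal sketch of the two placement rules as the quoted wiki passage describes them. It is not Cassandra's actual code: the function names, the (name, data center) node tuples, and the 8-node ring with alternating data centers are all hypothetical, and the fallback when the primary's data center runs out of nodes is an assumption that the wiki text does not cover.

```python
from typing import List, Tuple

# Hypothetical model: a node is (name, data center); the list is in token-ring order.
Node = Tuple[str, str]


def rack_unaware(ring: List[Node], primary: int, n: int) -> List[Node]:
    """Primary plus the next n-1 nodes walking around the ring."""
    n = min(n, len(ring))
    return [ring[(primary + i) % len(ring)] for i in range(n)]


def rack_aware(ring: List[Node], primary: int, n: int) -> List[Node]:
    """Placement following one reading of the quoted wiki text."""
    n = min(n, len(ring))
    replicas = [ring[primary]]
    if n == 1:
        return replicas
    primary_dc = ring[primary][1]
    # Every other node, in ring order starting right after the primary.
    following = [ring[(primary + i) % len(ring)] for i in range(1, len(ring))]
    # First non-primary replica: next node on the ring in ANOTHER data center.
    other = next((node for node in following if node[1] != primary_dc), None)
    if other is not None:
        replicas.append(other)
    # Remaining replicas: next nodes on the ring in the primary's data center.
    for node in following:
        if len(replicas) == n:
            break
        if node[1] == primary_dc:
            replicas.append(node)
    # Assumption (not in the wiki text): if that data center runs out of
    # nodes, fill up with whatever unused nodes come next on the ring.
    for node in following:
        if len(replicas) == n:
            break
        if node not in replicas:
            replicas.append(node)
    return replicas


# Hypothetical ring: 4 hosts per data center, alternating around the ring.
ring = [("a1", "DC1"), ("b1", "DC2"), ("a2", "DC1"), ("b2", "DC2"),
        ("a3", "DC1"), ("b3", "DC2"), ("a4", "DC1"), ("b4", "DC2")]

for n in (2, 3, 4):
    print(n, [dc for _, dc in rack_aware(ring, 0, n)])
# 2 ['DC1', 'DC2']
# 3 ['DC1', 'DC2', 'DC1']
# 4 ['DC1', 'DC2', 'DC1', 'DC1']
```

Under this reading, only one replica lands in the second data center no matter how large N is, which matches the N=2, 3 and 4 layouts listed above. Whether the real strategy behaves this way is exactly what the question asks.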