Date: Thu, 17 Jun 2010 20:44:32 +0300 (EEST)
Subject: Cassandra Multiple DataCenter Suitability - why?
From: altanis@ceid.upatras.gr
To: dev@cassandra.apache.org

Hello,

I keep reading everywhere that Cassandra has supported multiple datacenters from the beginning. I would like to know what Cassandra does to achieve that. Is it just that the developers have written some code that supports that scenario, or is there something inherent in Cassandra's design that makes it suitable for a multi-DC environment, such as minimizing inter-DC traffic? I have read about RackAwareStrategy on the wiki, and have also browsed through some code (DataCenterShardStrategy), but I would like to see what people have to say about this.
I also read about an implementation of rack awareness employing Zookeeper, but I gather that it wasn't released by Facebook and was geared more towards single-DC rack awareness, because Zookeeper is a bit heavy on bandwidth. Anyway, to sum up, my question is this: please explain in brief the reasons why Cassandra is well suited for multi-DC environments.

Alexander Altanis