From: Yudong Gao
Date: Wed, 6 Apr 2011 12:40:47 -0400
Subject: Re: Location-aware replication based on objects' access pattern
To: user@cassandra.apache.org
Cc: Sasha Dolgy

On Wed, Apr 6, 2011 at 3:55 AM, Sasha Dolgy wrote:
> I had been asked this question from a strategy point of view, and
> referenced how linkedin.com appears to handle this.
>
> Specific region data is stored on a ring in that region.  While based
> in the middle east, my linkedin.com profile was kept on the middle
> east part of linkedin.com ... when I moved back to europe and updated
> my city, my profile shifted from the middle east to europe ...
>
> would it not be easier to manage multiple rings (one in each required
> geographic region) to suit the location aware use case?  This way you
> can grow out that region as necessary and invest less into the regions
> that aren't as busy ...
>
> it would mean your application needs to be aware of the different
> regions and where data exists ... or make some initial assumptions as
> to where to find data ...
>
> - 1 ring for apac
> - 1 ring for europe
> - 1 ring for americas
> - 1 global ring (with nodes present in each region)
>
> the global ring maintains reference data on which ring a guid exists ...
>
> I've been playing with this concept on AWS ... the amount of data I
> have isn't significant, so I may not have run into problems that will
> occur when I get to large amounts of data ...

This is interesting. But how do you design the global ring to make sure
that it is not the bottleneck?
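To make that two-step lookup concrete, here is a minimal sketch written
against a pycassa-style client; the keyspace, column family, and host
names are purely hypothetical and only illustrate the
global-ring-then-regional-ring round trip:

import pycassa

# Hypothetical hosts for the global ring and the per-region rings.
GLOBAL_RING = ['global-node1:9160', 'global-node2:9160']
REGIONAL_RINGS = {
    'apac':     ['apac-node1:9160'],
    'europe':   ['eu-node1:9160'],
    'americas': ['us-node1:9160'],
}

# The global ring only stores guid -> region reference data.
global_pool = pycassa.ConnectionPool('GlobalDirectory',
                                     server_list=GLOBAL_RING)
directory = pycassa.ColumnFamily(global_pool, 'GuidLocation')

def get_profile(guid):
    # First hop: ask the global ring which regional ring owns this guid.
    region = directory.get(guid)['region']
    # Second hop: fetch the actual data from that regional ring.
    regional_pool = pycassa.ConnectionPool('Profiles',
                                           server_list=REGIONAL_RINGS[region])
    profiles = pycassa.ColumnFamily(regional_pool, 'Profile')
    return profiles.get(guid)

Every read pays the first round trip to the global ring, which is exactly
the bottleneck concern above.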
For example, if a client needs to access data in the US ring but first
has to talk to a europe node to get the reference data, this will not be
efficient.

Another potential problem is that the data is not synchronized among the
rings. If one data center goes down, the data stored there will be lost.

One way to get around this may be to use the NetworkTopologyStrategy. For
example, with RF=3, for the ring in europe we can specify 2 replicas in
europe and 1 replica in america (a sketch of such a keyspace definition
follows after the quoted thread below).

Thanks!

Yudong

> -sd
>
> On Wed, Apr 6, 2011 at 9:26 AM, Jonathan Colby wrote:
>> good to see a discussion on this.
>>
>> This also has practical use for business continuity, where you can
>> control that the clients in a given data center first write replicas to
>> their own data center, then to the other data center for backup.  If I
>> understand correctly, a write takes the token into account first, then
>> the replication strategy decides where the replicas go.  I would like to
>> see the first writes be based on "location" instead of token - whether
>> that is accomplished by manipulating the key or some other mechanism.
>>
>> That way, if you do suffer the loss of a data center, the clients are
>> guaranteed to meet quorum on the nodes in their own data center (given a
>> mirrored architecture across 2 data centers).
>>
>> We have 2 data centers.  If one goes down we have the problem that
>> quorum cannot be satisfied for half of the reads.
>
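For reference, here is a minimal sketch of how such a keyspace could be
defined through pycassa's SystemManager; the host name and the data
center names (EU/US) are assumptions and must match whatever names your
configured snitch reports:

from pycassa.system_manager import SystemManager, NETWORK_TOPOLOGY_STRATEGY

# Connect to any node in the europe ring (hypothetical host name).
sys_mgr = SystemManager('eu-node1:9160')

# RF=3 overall: 2 replicas kept in the "EU" data center and 1 in "US",
# so a copy of the europe data survives the loss of the europe data center.
sys_mgr.create_keyspace('Profiles',
                        replication_strategy=NETWORK_TOPOLOGY_STRATEGY,
                        strategy_options={'EU': '2', 'US': '1'})
sys_mgr.close()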