Date: Wed, 19 Jun 2013 11:27:52 -0700
Subject: Re: Joining distinct clusters with the same schema together
From: Robert Coli
To: user@cassandra.apache.org

On Wed, Jun 19, 2013 at 10:50 AM, Faraaz Sareshwala wrote:
> Each datacenter will have a cassandra cluster with a separate set of seeds
> specific to that datacenter. However, the cluster name will be the same.
>
> Question 1: is this enough to guarantee that the three datacenters will have
> distinct cassandra clusters as well? Or will one node in datacenter A still
> somehow be able to join datacenter B's ring?

If they have network connectivity and the same cluster name, they are
the same logical cluster. However, if your nodes share tokens and you
have auto_bootstrap: true (the implicit default), the second node you
attempt to start will refuse to start, because you are trying to
bootstrap it into the range of a live node.

> For now, we are planning on using our own relay mechanism to transfer
> data changes from one datacenter to another.

Are you planning to use the streaming commitlog functionality for
this? I am not sure how you would capture all changes otherwise, short
of having your app write the same thing to multiple places.

Cassandra uses data timestamps to merge, so unless the timestamps are
identical between clusters, otherwise-identical data will not merge
properly.

> Question 2: is this a sane strategy?

On its face my answer is "not... really"? What do you see yourself
gaining from this technique compared with built-in replication? As an
example, you lose the ability to choose between LOCAL_QUORUM and
EACH_QUORUM consistency level operations.

> Question 3: eventually, we want to turn all these cassandra clusters into one
> large multi-datacenter cluster. What's the best practice to do this? Should I
> just add nodes from all datacenters to the list of seeds and let cassandra
> resolve differences? Is there another way I don't know about?

If you are using NetworkTopologyStrategy and have the same cluster
name for your isolated clusters, all you need to do is:

1) configure NTS to store replicas on a per-datacenter basis
2) ensure that your nodes are in different logical data centers (by
   default, all nodes are in DC1/rack1)
3) ensure that the clusters are able to reach each other
4) ensure that tokens do not overlap between clusters (the common
   technique with manual token assignment is that each node gets a
   range which is off-by-one)
5) ensure that all nodes' seed lists contain (recommended) 3 seeds
   from each DC
6) rolling restart (so the new seed list is picked up)
7) repair ("should" only be required if writes have not replicated via
   your out-of-band mechanism)

Vnodes change the picture slightly, because the chance of your
clusters having conflicting tokens increases with the number of token
ranges you have.

=Rob
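
For step 4, here is a rough sketch of the off-by-one technique and a
check for overlapping tokens before the clusters are joined. It
assumes RandomPartitioner (token space treated here as 0 to 2**127 - 1),
one manually assigned token per node (no vnodes), and the same node
count in every datacenter. The names RING_SIZE, offset_tokens and
find_conflicts are made up for illustration and are not part of
Cassandra or its tooling.

```python
# Assumption: RandomPartitioner, with tokens treated as 0 .. 2**127 - 1.
RING_SIZE = 2 ** 127


def offset_tokens(nodes_per_dc, dc_index):
    """Evenly spaced tokens for one datacenter, shifted by dc_index.

    Datacenter 0 gets the plain evenly spaced tokens, datacenter 1 gets
    each of those plus one, and so on (the "off-by-one" technique).
    """
    return [
        (i * RING_SIZE // nodes_per_dc + dc_index) % RING_SIZE
        for i in range(nodes_per_dc)
    ]


def find_conflicts(tokens_by_dc):
    """Return (token, first_dc, second_dc) for every token claimed twice."""
    seen = {}
    conflicts = []
    for dc, tokens in tokens_by_dc.items():
        for token in tokens:
            if token in seen:
                conflicts.append((token, seen[token], dc))
            else:
                seen[token] = dc
    return conflicts


if __name__ == "__main__":
    # Hypothetical layout: three datacenters with four nodes each.
    plan = {name: offset_tokens(4, idx)
            for idx, name in enumerate(["DC1", "DC2", "DC3"])}
    for name, tokens in plan.items():
        print(name, tokens)
    print("conflicts:", find_conflicts(plan))
```

If your existing clusters already have tokens assigned, you could
feed the tokens reported by each ring into find_conflicts instead of
generating new ones, and move any node that collides before step 6.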