Message-ID: <4DFB956F.9050202@dude.podzone.net>
Date: Fri, 17 Jun 2011 11:57:03 -0600
From: AJ
To: user@cassandra.apache.org
Subject: Re: Docs: Token Selection

On 6/17/2011 10:31 AM, Eric tamme wrote:
>> What I don't like about NTS is I would have to have more replicas than I
>> need.  {DC1=2, DC2=2}, RF=4 would be the minimum.  If I felt that 2 local
>> replicas was insufficient, I'd have to move up to RF=6 which seems like a
>> waste... I'm predicting data in the TB range so I'm trying to keep replicas
>> to a minimum.
>>
>> My goal is to have 2-3 replicas in a local data center and 1 replica in
>> another dc.  I think that would be enough barring a major catastrophe.  But,
>> I'm not sure this is possible.  I define "local" as in the same data center
>> as the client doing the insert/update.
> Yes, not being able to configure the replication factor differently
> for each data center is a bit annoying.  Im assuming you basically
> want DC1 to have a replication factor of {DC1:2, DC2:1} and DC2 to
> have {DC1:1,DC2:2}.

Yes.  But the more I think about it, the more I see issues.  Here is what
I envision (issues marked with *):

Three or more dc's, each serving as a fail-over for the others, with at
most 1 unavailable dc supported at a time.

Each dc is a production dc serving users that I choose.

Each dc also stores 0-1 replicas from the other dc's.

Direct customers to their "home" dc of my choice.

Data coming from a client local to the dc is replicated X times in the
local dc and 1 time in any other dc (randomly).

In the event a dc becomes unreachable by users, an arbitrary fail-over dc
can serve their requests, albeit with increased latency.

*There will only be 1 replica left amongst the remaining fail-over dc's, so
this could be a problem for any CL other than CL.ONE.

*During the fail-over state, the cluster needs to know that the real "home"
of the replicas is the currently unavailable dc.  But, as of now, I don't
think that's possible, and so new writes will start to be replicated in the
current dc as if the currently-used fail-over dc were the home dc.

Maybe these goals can be achieved with a kind of ordered asymmetrical
replication strategy like you illustrated above.  The hard part will be to
figure out a simple and elegant way to do this w/o undermining C*.

> I would very much like that feature as well, but I dont know the
> feasibility of it.
>
> -Eric
>
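
To put rough numbers on the first starred issue, here is a minimal Python
sketch.  It is only arithmetic, not anything Cassandra provides: the function
names (total_rf, quorum, surviving, can_satisfy) and the dict layouts are
made up for illustration, and the asymmetric layout is the hypothetical
{home: X, other: 1} scheme described above.  The only Cassandra fact assumed
is that QUORUM needs floor(RF/2) + 1 replicas, with RF summed over all dc's.

```python
def total_rf(layout):
    """Total replication factor: replicas summed over all dc's."""
    return sum(layout.values())


def quorum(rf):
    """Replicas required for QUORUM at total replication factor rf."""
    return rf // 2 + 1


def surviving(layout, down_dc):
    """Replicas still reachable when down_dc is unavailable."""
    return sum(n for dc, n in layout.items() if dc != down_dc)


def can_satisfy(layout, down_dc, cl):
    """True if consistency level cl ('ONE' or 'QUORUM') can still be met."""
    needed = 1 if cl == "ONE" else quorum(total_rf(layout))
    return surviving(layout, down_dc) >= needed


# Hypothetical asymmetric layout: 2 replicas in the home dc, 1 elsewhere.
asym = {"DC1": 2, "DC2": 1}
# Symmetric layout from the quoted text: {DC1=2, DC2=2}, RF=4.
sym = {"DC1": 2, "DC2": 2}

for name, layout in (("asymmetric", asym), ("symmetric", sym)):
    print(name,
          "RF =", total_rf(layout),
          "left after DC1 down =", surviving(layout, "DC1"),
          "ONE ok =", can_satisfy(layout, "DC1", "ONE"),
          "QUORUM ok =", can_satisfy(layout, "DC1", "QUORUM"))
```

With the home dc down, the asymmetric layout leaves 1 of 3 replicas, so only
CL.ONE can be met.  Under the same assumption the symmetric RF=4 layout
leaves 2 of 4, which is also below the QUORUM of 3.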