From: Ben Bromhead <ben@instaclustr.com>
Subject: Re: Multi-DC Environment Question
Date: Fri, 30 May 2014 12:13:00 +1000
To: user@cassandra.apache.org

Short answer:

If the elapsed time exceeds max_hint_window_in_ms, hints stop being created for the down node. You will then need to rely on your read consistency level, read repair, and anti-entropy repair operations to restore consistency.
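
As a minimal sketch of the knobs involved (the yaml setting and its default come from a stock cassandra.yaml; the host and keyspace names are made up for illustration):

    # cassandra.yaml -- how long a coordinator stores hints for a dead node
    max_hint_window_in_ms: 10800000    # default: 3 hours

    # For outages longer than the hint window, run anti-entropy repair
    # on each node that was down once it is back, e.g.:
    $ nodetool -h dc1-node1 repair my_keyspace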

Long answer:

http://www.slideshare.net/jasedbrown/understanding-antientropy-in-cassandra

Ben Bromhead
Instaclustr | www.instaclustr.com | @instaclustr | +61 415 936 359

On 30 May 2014, at 8:40 am, Tupshin Harper <tupshin@tupshin.com> wrote:

When one node or DC is down, the coordinator nodes handling writes will notice this and store hints (hinted handoff is the mechanism), and those hints are later used to deliver the data that could not be replicated initially.
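
If you want to watch this happening, here is one way to peek (assuming Cassandra 2.0, where undelivered hints are rows in the system.hints table):

    $ cqlsh
    cqlsh> SELECT target_id, dateOf(hint_id) FROM system.hints LIMIT 10;

    # the HintedHandoff pool in tpstats shows delivery activity
    $ nodetool tpstats | grep -i hint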

http://www.datastax.com/dev/blog/modern-hinted-handoff

-Tupshin

On May 29, 2014 6:22 PM, "Vasileios Vlachos" <vasileiosvlachos@gmail.com> wrote:
Hello All,

We have plans to add a second DC to our live Cassandra environment. Currently RF=3 and we read and write at QUORUM. After adding DC2 we are going to be reading and writing at LOCAL_QUORUM.
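
For reference, the keyspace change for the second DC and the client-side consistency switch would look roughly like this (the keyspace name is a placeholder, and 'DC1'/'DC2' must match the data centre names your snitch reports):

    cqlsh> ALTER KEYSPACE my_keyspace WITH replication =
       ... {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3};
    cqlsh> CONSISTENCY LOCAL_QUORUM;
    Consistency level set to LOCAL_QUORUM.

After that you would also run nodetool rebuild (with the existing DC as the source argument) on each new DC2 node to stream the existing data across.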

If my understanding is correct, when a client sends a write request, success is returned to the client as soon as the consistency level is satisfied in DC1 (that is, floor(RF/2)+1 = 2 replicas for RF=3), and DC2 will eventually get the data as well. The assumption behind this is that the client always connects to DC1 for reads and writes, and that DC1 and DC2 are linked by a site-to-site VPN. Therefore, DC1 will almost always return success before DC2 (actually, I don't know if it is possible for DC2 to be more up-to-date than DC1 with this setup...).

Now imagine DC1 loses connectivity and the client fails over to DC2. Everything should work fine after that, with the only difference that DC2 will now be handling the requests directly from the client. After some time, say after max_hint_window_in_ms, DC1 comes back up. My question is: how do I bring DC1 up to speed with DC2, which is now more up-to-date? Will that require a nodetool repair on the DC1 nodes? Also, what is the answer when the outage is < max_hint_window_in_ms instead?

Thanks in advance!

Vasilis

-- 
Kind Regards,

Vasileios Vlachos
