From: Pushkar Raste
Date: Wed, 24 May 2017 18:35:19 -0400
Subject: Re: Spread SolrCloud across two locations
To: solr-user@lucene.apache.org

A setup I have used in the past was to have an observer in DC2. If DC1
goes boom, you need manual intervention to change the observer's role to
make it a follower. When DC1 comes back up, change one instance in DC2 to
make it an observer again.
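For reference, the observer part is just ZooKeeper configuration. A
minimal sketch (hostnames are hypothetical, and this assumes the static
zoo.cfg style of ZK 3.4; with 3.5+ dynamic reconfig the promote step
looks different):

    # zoo.cfg, same server list on every node
    server.1=dc1-zk1:2888:3888
    server.2=dc1-zk2:2888:3888
    server.3=dc1-zk3:2888:3888
    server.4=dc2-zk1:2888:3888
    server.5=dc2-zk2:2888:3888:observer

    # and on dc2-zk2 only:
    peerType=observer

    # "Promoting" the observer is a config edit: drop the ":observer"
    # suffix and the peerType line, shrink the server list to the
    # surviving DC2 nodes so they can still form a majority, and restart
    # ZK on those nodes. Reverse the edit once DC1 is healthy again.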
On May 24, 2017 6:15 PM, "Jan Høydahl" wrote:

> Sure, ZK by design does not support a two-node/two-location setup. But
> still, users may want/need to deploy that, and my question was if there
> are smart ways to make such a setup as painless as possible in case of
> failure.
>
> Take the example of DC1: 3xZK and DC2: 2xZK again. And then DC1 goes
> BOOM. Without an active action, DC2 would be read-only.
> What if the Ops personnel in DC2 could then, with a single
> script/command, instruct DC2 to resume the "master" role:
> - Add a 3rd DC2 ZK to the two existing ones, reconfigure and let them
>   sync up.
> - Rolling restart of the Solr nodes with the new ZK_HOST string.
> Of course, they would also then need to make sure that DC1 does not boot
> up again before a compatible change has been made there too.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
> > On 23 May 2017, at 18:56, Shawn Heisey wrote:
> >
> > On 5/23/2017 10:12 AM, Susheel Kumar wrote:
> >> Hi Jan, FYI - since last year I have been running a Solr 6.0 cluster
> >> in one of our lower environments with 6 shards/replicas in dc1 and 6
> >> shards/replicas in dc2 (each shard replicated across the data
> >> centers), with 3 ZK in dc1 and 2 ZK in dc2. (I didn't have a 3rd data
> >> center available for ZK, so went with only 2 data centers in the
> >> above configuration.) So far no issues: it has been running fine,
> >> indexing, replicating data, serving queries etc. So in my test,
> >> setting up a single cluster across two zones/data centers works
> >> without any issue when there is no or very minimal latency (in my
> >> case around 30 ms one way).
> >
> > With that setup, if dc2 goes down, you're all good, but if dc1 goes
> > down, you're not.
> >
> > There aren't enough ZK servers in dc2 to maintain quorum when dc1 is
> > unreachable, and SolrCloud is going to go read-only. Queries would
> > most likely work, but you would not be able to change the indexes at
> > all.
> >
> > ZooKeeper with N total servers requires int((N/2)+1) servers to be
> > operational to maintain quorum. This means that with five total
> > servers, three must be operational and able to talk to each other, or
> > ZK cannot guarantee that there is no split-brain, so quorum is lost.
> >
> > ZK in two data centers will never be fully fault-tolerant. There is
> > no combination of servers that will work properly. You must have
> > three data centers for a geographically fault-tolerant cluster. Solr
> > would be optional in the third data center. ZK must be installed in
> > all three.
> >
> > Thanks,
> > Shawn
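For what it's worth, here is a rough, untested sketch of what Jan's
"single script/command" for DC2 could look like. Everything in it is an
assumption: hypothetical hostnames, ZooKeeper 3.5+ with dynamic
reconfiguration enabled, and Solr installed as a service with ZK_HOST set
in /etc/default/solr.in.sh.

    #!/bin/sh
    # Promote DC2 to a self-sufficient cluster after DC1 is lost.
    NEW_ZK=dc2-zk3.example.com                   # spare host for a 3rd DC2 ZK
    DC2_ZK=dc2-zk1.example.com:2181,dc2-zk2.example.com:2181

    # 1. Start ZooKeeper on the spare host, then add it as a voting member.
    ssh "$NEW_ZK" 'zkServer.sh start'
    zkCli.sh -server "$DC2_ZK" reconfig \
        -add "server.6=${NEW_ZK}:2888:3888:participant;2181"

    # 2. Rolling restart of the DC2 Solr nodes against the DC2-only ensemble.
    NEW_ZK_HOST="${DC2_ZK},${NEW_ZK}:2181"
    for node in dc2-solr1 dc2-solr2; do          # hypothetical Solr hosts
        ssh "$node" "sudo sed -i 's|^ZK_HOST=.*|ZK_HOST=\"${NEW_ZK_HOST}\"|' \
            /etc/default/solr.in.sh && sudo service solr restart"
    done

The guard Jan mentions still applies: DC1 must not be allowed to boot its
old ZK/Solr configuration before it has been updated to match.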