From: Alexander Shraer
Date: Wed, 17 Sep 2014 21:56:48 -0700
Subject: Re: Reconfig without quorum
To: "user@zookeeper.apache.org"

Hi Martin,

Yes, reconfig, like other ZooKeeper operations, works only when there is a quorum. Although you say that zone 1 failed, it may be that only the link between zone 1 and zone 2 failed and the zones themselves are fine. If we allowed each zone to process commands (reconfig or others) in that situation, we would end up with split-brain and lose consistency.

If you're sure that zone 1 is down, you could shut down the servers in zone 2, change their configuration files to exclude zone 1, and restart. Note that when you restart, you should bring the servers up in an order that does not allow a quorum to form without at least one server that has the latest state. Otherwise you'll lose data.

Example: zone 1 has participant replicas A, B, C and zone 2 has participants D, E, F. The latest state is on A, B, C, D. Zone 1 fails and you restart the zone 2 servers, but E and F come up first. In this case E and F can form a quorum without D, and you're likely to lose the latest updates.

Perhaps others can suggest a better solution, but you could consider having a tie-breaker replica in a third location. Or, if you don't need consistency between the zones, you could run two separate ZooKeeper ensembles. Does your application require consistency between zones 1 and 2?
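To make the quorum arithmetic in the example concrete, here is a rough Python sketch. It is only a toy model, not ZooKeeper code: the helper names (majority, has_quorum, quorum_may_lose_updates) are made up for illustration, and it assumes the default majority quorum (strictly more than half of the configured participants).

```python
# Toy model of the A..F example; all names here are hypothetical
# and are not part of any ZooKeeper API.

def majority(ensemble_size):
    """Smallest number of servers that is strictly more than half."""
    return ensemble_size // 2 + 1

def has_quorum(up, ensemble):
    """True if the running servers form a majority of the configured ensemble."""
    return len(up & ensemble) >= majority(len(ensemble))

def quorum_may_lose_updates(up, ensemble, latest):
    """True if a quorum can form with no server that holds the latest state."""
    return has_quorum(up, ensemble) and not (up & latest)

original = {"A", "B", "C", "D", "E", "F"}   # both zones
zone2 = {"D", "E", "F"}                     # survivors after zone 1 fails
latest = {"A", "B", "C", "D"}               # servers holding the latest state

print(majority(len(original)))       # 4
print(has_quorum(zone2, original))   # False: 3 of 6 is not a majority

# After editing the config files so the ensemble is only D, E, F:
print(majority(len(zone2)))                                   # 2
print(quorum_may_lose_updates({"E", "F"}, zone2, latest))     # True: D is missing
print(quorum_may_lose_updates({"D", "E"}, zone2, latest))     # False: D is present
```

In other words, with the shrunken configuration any two of D, E, F are enough for a quorum, so D has to be among the first servers you start.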
Alex

On Wed, Sep 17, 2014 at 1:19 PM, Martin Grotzke <martin.grotzke@googlemail.com> wrote:

> Hi,
>
> is it true, that the reconfig command that's available since 3.5.0 can only
> be used if there's a quorum?
>
> Our situation is that we have 2 datacenters (actually only 2 zones within
> the same DC) which will be provisioned equally, so that we'll have an even
> number of ZK nodes (true, not optimal). When 1 zone fails, there won't be a
> quorum any more and ZK will be unavailable - that's my understanding. Is it
> possible to add new nodes to the ZK cluster and achieve a quorum again
> while the failed zone is still unavailable?
>
> What would you recommend how to handle this situation?
>
> We're using (going to use) SolrCloud as clients.
>
> Thanks && cheers,
> Martin