From: Jordan Zimmerman
To: "user@zookeeper.apache.org"
CC: "zookeeper-user@hadoop.apache.org"
Subject: Re: cluster member was switched to standalone, detectable?
Date: Fri, 18 May 2012 20:57:17 +0000

The 'srvr' command lists what mode the instance thinks it's in.
Unfortunately, you have to manually parse it. If there's a quorum issue it
outputs something like "This ZooKeeper is not serving requests".

-JZ

On 5/18/12 1:55 PM, "Adam Rosien" wrote:

>Do the four-letter words tell me if a service joined the quorum correctly?
>What commands and responses will tell me?
>
>How do I know what cluster it joined? What if nodes X & Y are in cluster A
>but Z is in cluster B, should there be a cluster identifier to distinguish
>membership?
>
>On Fri, May 18, 2012 at 12:05 PM, Patrick Hunt wrote:
>
>> That would detect it, I don't think it's avoidable in the sense that
>> we can't detect that type of mis-configuration and somehow handle it
>> (ie stop). Your best bet would be to automate the process (and test
>> that ahead of time), or bring up the new server with the client port
>> set to something previously unused, then verify, then restart it with
>> the client port set as it was originally.
>> I often do this when
>> debugging issues. (but that itself might cause problems wrt config
>> typos). Another option is to use iptables (etc...) to turn off access
>> to clients until you've verified the server joined the quorum
>> correctly, then turn off the filter.
>>
>> Patrick
>>
>> On Fri, May 18, 2012 at 11:51 AM, Jordan Zimmerman
>> wrote:
>> > ZooKeeper has a telnet style interface for periodic querying.
>> >
>> > You could also use Exhibitor and query its REST API periodically. I
>> > should probably add alerting to Exhibitor for this kind of thing.
>> >
>> > -JZ
>> >
>> > On 5/18/12 10:34 AM, "Adam Rosien" wrote:
>> >
>> >>We have a 5-member 3.3.3 cluster. One of the node's configurations was
>> >>accidentally changed, and that node went into "standalone" mode, thinking
>> >>it was a single-node cluster. However, all our zk clients still had the
>> >>address of this server, and when connected obviously got missing or wrong
>> >>data.
>> >>
>> >>Is this situation avoidable somehow?
>> >>
>> >>.. Adam
>> >
>>
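
For reference, the 'srvr' check described at the top of this message could be scripted roughly as follows. This is only a sketch under some assumptions: that the server answers the four-letter word on its client port (2181 is assumed as the default here), that the reply contains a line of the form "Mode: <value>", and that "leader", "follower" and "observer" are the values that indicate ensemble membership. The function names and the EXPECTED_MODES set are made up for illustration and are not part of ZooKeeper.

```python
import socket

# Assumption: modes that mean the server is part of an ensemble.
# "standalone" is deliberately excluded, since that is the failure case.
EXPECTED_MODES = {"leader", "follower", "observer"}


def query_srvr(host, port=2181, timeout=5.0):
    """Send the 'srvr' four-letter word and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"srvr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_srvr_mode(response):
    """Return the value of the 'Mode:' line, or None if there is none."""
    for line in response.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def is_quorum_member(response):
    """True only if the reported mode is one of EXPECTED_MODES.

    A reply without a 'Mode:' line (for example the "not serving
    requests" message) is treated as not being a quorum member.
    """
    return parse_srvr_mode(response) in EXPECTED_MODES
```

A periodic monitor could call is_quorum_member(query_srvr(host)) for each server in the ensemble and alert when it returns False. Note that this only tells you the mode a server reports; it does not tell you which cluster the server joined.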