From: Adam Rosien
To: user@zookeeper.apache.org
Cc: "zookeeper-user@hadoop.apache.org"
Date: Fri, 18 May 2012 13:55:16 -0700
Subject: Re: cluster member was switched to standalone, detectable?

Do the four-letter words tell me if a service joined the quorum correctly?
What commands and responses will tell me? How do I know what cluster it
joined? What if nodes X & Y are in cluster A but Z is in cluster B, should
there be a cluster identifier to distinguish membership?

On Fri, May 18, 2012 at 12:05 PM, Patrick Hunt wrote:

> That would detect it, I don't think it's avoidable in the sense that
> we can't detect that type of mis-configuration and somehow handle it
> (ie stop). Your best bet would be to automate the process (and test
> that ahead of time), or bring up the new server with the client port
> set to something previously unused, then verify, then restart it with
> the client port set as it was originally. I often do this when
> debugging issues. (but that itself might cause problems wrt config
> typos). Another option is to use iptables (etc...) to turn off access
> to clients until you've verified the server joined the quorum
> correctly, then turn off the filter.
>
> Patrick
>
> On Fri, May 18, 2012 at 11:51 AM, Jordan Zimmerman wrote:
> > ZooKeeper has a telnet style interface for periodic querying.
> >
> > You could also use Exhibitor and query its REST API periodically. I
> > should probably add alerting to Exhibitor for this kind of thing.
> >
> > -JZ
> >
> > On 5/18/12 10:34 AM, "Adam Rosien" wrote:
> >
> >>We have a 5-member 3.3.3 cluster. One of the nodes' configurations was
> >>accidentally changed, and that node went into "standalone" mode, thinking
> >>it was a single-node cluster. However, all our zk clients still had the
> >>address of this server, and when connected obviously got missing or wrong
> >>data.
> >>
> >>Is this situation avoidable somehow?
> >>
> >>.. Adam
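
For reference, the periodic check discussed in this thread could be sketched roughly as below. This is only an illustration under stated assumptions, not a tested monitoring tool: it assumes the servers answer the "stat" four-letter word on their client port and that the reply contains a "Mode:" line (leader, follower, or standalone). The helper names (four_letter_word, parse_mode, check_ensemble, find_problems) and the host names in the usage comment are made up for the example.

```python
import socket


def four_letter_word(host, port, command, timeout=5.0):
    """Send a four-letter command (e.g. "stat") and return the raw reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(reply):
    """Return the value of the "Mode:" line, or None if there is none."""
    for line in reply.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def check_ensemble(servers, timeout=5.0):
    """Map each (host, port) pair to its reported mode or an error string."""
    modes = {}
    for host, port in servers:
        try:
            reply = four_letter_word(host, port, "stat", timeout)
            modes[(host, port)] = parse_mode(reply)
        except OSError as exc:  # refused, timed out, DNS failure, ...
            modes[(host, port)] = "unreachable: %s" % exc
    return modes


def find_problems(modes, expected=("leader", "follower")):
    """Return the servers whose mode is not one of the expected values."""
    return [server for server, mode in modes.items() if mode not in expected]


# Hypothetical usage (host names are placeholders):
#   modes = check_ensemble([("zk1", 2181), ("zk2", 2181), ("zk3", 2181)])
#   for server in find_problems(modes):
#       print("check", server, "->", modes[server])
```

A server reporting "standalone" in what should be a multi-member ensemble would show up in find_problems. This does not answer the cluster-identity question: the "Mode:" line says nothing about which ensemble a server joined. Counting leaders across the expected member list is at best a partial heuristic, since more than one leader would suggest the servers do not all belong to the same ensemble. If observers are in use, their mode would need to be added to the expected values.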