From: Jim Keeney
Date: Thu, 1 Mar 2018 21:13:02 -0500
Subject: Re: Ensemble fails when one node looses connectivity
To: user@zookeeper.apache.org

Thanks,

Yes, I have about 2MB stored in the configuration folders. I will increase
jute.maxbuffer and see if that helps.

Jim K.

On Thu, Mar 1, 2018 at 8:58 PM, Steph van Schalkwyk wrote:

> Does the log say anything about timing out on init?
> Your initLimit is already pretty big, but then we don't know anything about
> your setup.
> Are you storing more than 1MB in a znode? Then increase jute.maxbuffer (in
> java.env as a -Djute.maxbuffer=xxxxxx).
> I've recently run into that with Fusion 3.1.
> Post more details, if you would.
> Good luck.
> Steph
>
> On Thu, Mar 1, 2018 at 7:43 PM, Jim Keeney wrote:
>
> > I'm using Zookeeper with solr to create a cluster and I have come across
> > what seems like unexpected behavior. The cluster is set up on AWS using
> > opsworks. I am using a 3 node zookeeper ensemble. The zookeeper config
> > on all three nodes is:
> >
> > clientPort=2181
> > dataDir=/var/opt/zookeeper/data
> > tickTime=2000
> > autopurge.purgeInterval=24
> > initLimit=100
> > syncLimit=5
> > server.1=172.31.86.130:2888:3888
> > server.2=172.31.16.234:2888:3888
> > server.3=172.31.73.122:2888:3888
> >
> > Here is the issue:
> >
> > If one node in the ensemble fails or is shut down, the ensemble carries
> > on. However, when the node is restarted, its attempts to connect to the
> > other members of the cluster are rejected. The only way that I have found
> > to restore the ensemble is to restart all of the nodes within a short
> > time span of each other.
> >
> > If I do that, they are able to discover each other, carry on a proper
> > leader election, and restore order.
> >
> > Once they are restored everything is fine, but if one of the nodes goes
> > down we are faced with the same problem.
> >
> > How do I ensure that if a node goes down, it can restart and rejoin the
> > ensemble without having to manually restart all the other nodes?
> >
> > Any help appreciated.
> >
> > Thanks.
> >
> > Jim K.
> >
> > --
> > Jim Keeney
> > President, FitterWeb
> > E: jim@fitterweb.com
> > M: 703-568-5887
> >
> > *FitterWeb Consulting*
> > *Are you lean and agile enough?*

--
Jim Keeney
President, FitterWeb
E: jim@fitterweb.com
M: 703-568-5887

*FitterWeb Consulting*
*Are you lean and agile enough?*