Message-ID: <4F60CED2.5080601@nokia.com>
Date: Wed, 14 Mar 2012 18:01:06 +0100
From: Christian Ziech
Reply-To: user@zookeeper.apache.org
Subject: Re: Zookeeper on short lived VMs and ZOOKEEPER-107
In-Reply-To: <4F60C181.90802@nokia.com>

Just a small addition: in my opinion the patch could really boil down to adding a

    quorumServer.electionAddr = new InetSocketAddress(electionAddr.getHostName(), electionAddr.getPort());

in the catch (IOException e) clause of the connectOne() method of QuorumCnxManager. In addition, one should perhaps make the electionAddr field in the QuorumPeer.QuorumServer class volatile to prevent races.

I haven't yet fully checked this change for implications, but a quick test on some machines at least showed that it would solve our use case. What do the more expert users / maintainers think: is it even worthwhile to go that route?

On 14.03.2012 17:04, ext Christian Ziech wrote:
> Let me describe our upcoming use case in a few words: We are planning
> to use zookeeper in a cloud where nodes typically come and go
> unpredictably. We could ensure that we always have a more or less
> fixed quorum of zookeeper servers with a fixed set of host names.
> However, the IPs associated with the host names would change every time
> a new server comes up. I browsed the code a little and it seems that
> right now the only problem is that the zookeeper server remembers
> the resolved InetSocketAddress in its QuorumPeer hash map.
>
> I saw that ZOOKEEPER-107 would possibly also solve that problem, but
> possibly in a more generic way than actually needed (perhaps here I
> underestimate the impact of joining as a server with an empty data
> directory to replace a server that previously had one).
>
> Given that, from looking at ZOOKEEPER-107, it seems that it will
> still take some time for the proposed fix to make it into a release,
> would it make sense to invest time in a smaller fix just for
> this "replacing a dropped server without rolling restarts" use case?
> Would there be a chance that a fix for this makes it into the 3.4.x
> branch?
>
> Are there perhaps other ways to get this use case supported without
> the need for rolling restarts whenever we need to replace one of
> the zookeeper servers?
>

--
*NOKIA*
*Christian Ziech*
Senior Software Developer
Context Based Services
Services & Software
Mobile: +4915155155740
Fax: +493044676555
eMail: christian.ziech@nokia.com

Nokia gate5 GmbH
Invalidenstr. 117
10115 Berlin, Germany
www.maps.nokia.com
www.smart2go.com

Nokia gate5 GmbH, registered office: Berlin, District Court (Amtsgericht) Charlottenburg: HRB 106443 B, Tax no.: 37/222/20817, ID/VAT no.: DE 812 845 193, Managing directors: Dr. Michael Halbherr, Karim Tähtivuori
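
For illustration, the re-resolve-on-failure idea from the message above can be sketched as standalone Java. This is only a sketch under stated assumptions, not the actual ZooKeeper code: `PeerEndpoint`, `reResolve()` and the `timeoutMillis` parameter are hypothetical names, and only the `new InetSocketAddress(getHostName(), getPort())` line and the `volatile` field come from the proposal. An `InetSocketAddress` keeps the IP it resolved at construction time, so building a new instance from the host name triggers another lookup. That lookup may still be answered from the JVM's own DNS cache.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical stand-in for QuorumPeer.QuorumServer; not ZooKeeper's real class. */
class PeerEndpoint {
    // volatile so that a refreshed address becomes visible to other threads
    volatile InetSocketAddress electionAddr;

    PeerEndpoint(String host, int port) {
        // The constructor resolves the host name once and stores the result.
        this.electionAddr = new InetSocketAddress(host, port);
    }

    /** Replaces the stored address with a new instance built from host name and port. */
    void reResolve() {
        InetSocketAddress old = electionAddr;
        electionAddr = new InetSocketAddress(old.getHostName(), old.getPort());
    }

    /**
     * Tries to connect once. On any IOException the address is re-resolved,
     * so that a later attempt can pick up a changed IP for the same host name.
     */
    boolean connectOne(int timeoutMillis) {
        InetSocketAddress addr = electionAddr;
        try (Socket socket = new Socket()) {
            socket.connect(addr, timeoutMillis);
            return true;
        } catch (IOException e) {
            reResolve();
            return false;
        }
    }
}
```

A caller would simply retry `connectOne(...)` later; the second attempt uses whatever address the host name resolved to after the failure.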