From: "FPJ"
Subject: RE: ZOOKEEPER-900 / 901 / 1678
Date: Wed, 30 Apr 2014 09:09:23 +0100
Message-ID: <024201cf644b$7fcbc820$7f635860$@yahoo.com>

Hi Cameron,

Which version of ZK are you using? Also, if you can share logs, then it might be easier for us to help you out.

-Flavio

> -----Original Message-----
> From: Cameron McKenzie [mailto:mckenzie.cam@gmail.com]
> Sent: 30 April 2014 08:44
> To: zookeeper-user@hadoop.apache.org
> Subject: ZOOKEEPER-900 / 901 / 1678
>
> ZooKeeper users,
> Does anyone know the status of these issues? They don't seem to have had
> anything done to them since late 2010?
>
> I think that we're experiencing the same issue currently. If we have a 3 node
> cluster for example, and 1 of these nodes is completely dead (i.e the entire
> host is not contactable due to a power outage), I would expect that a
> quorum could still be formed, but this does not appear to be the case.
>
> I haven't delved into the code too much, but it appears that blocking IO is
> being used for the connect. This doesn't respect the socket SO timeout being
> set, so it means that the connect() call can block for some arbitrary amount of
> time (based on the OS level TCP settings?). This in turn means that leader
> election will fail because it times out before the socket connect does, even
> though there are enough live hosts present to form a quorum.
>
> This seems like a fairly fundamental problem, unless I'm missing something.
> If a single host goes down due to a power failure for example, it can prevent
> any further hosts joining the cluster. In addition, if after a power failure,
> enough hosts come back online to form a quorum, but some don't, that a
> quorum may still not be able to be formed.
> cheers
> Cam
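
For readers following the thread, the socket behaviour described in the quoted message can be illustrated with plain java.net calls. This is a minimal sketch, not ZooKeeper code: the class name ConnectTimeoutSketch, the helper canConnect, the 2000 ms timeout, and the default port 3888 (the election port commonly used in example configurations) are all assumptions made for illustration. It relies on two documented facts: SO_TIMEOUT (setSoTimeout) bounds blocking reads only, and Socket.connect(SocketAddress, int) accepts an explicit connect timeout. Whether the election code in any given ZooKeeper release connects this way is not established by this thread.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Illustrative sketch only; not taken from the ZooKeeper code base.
class ConnectTimeoutSketch {

    // Hypothetical helper: returns true if a TCP connection to host:port
    // is established within timeoutMs milliseconds, false otherwise.
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            // SO_TIMEOUT bounds blocking reads on the socket. It does not
            // bound the connect attempt itself.
            socket.setSoTimeout(timeoutMs);
            // The two-argument connect() takes an explicit timeout and throws
            // SocketTimeoutException (an IOException) when it expires.
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Timed out, refused, unreachable, or host name not resolvable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Host and port are placeholders; pass real values as arguments.
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 3888;

        long start = System.nanoTime();
        boolean ok = canConnect(host, port, 2000);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(host + ":" + port + " reachable=" + ok
                + " after " + elapsedMs + " ms");
    }
}
```

Pointing this at a powered-off host should return within roughly the chosen timeout, whereas a connect without an explicit timeout is bounded only by the operating system's TCP settings, which matches the delay described in the quoted message.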