Date: Fri, 11 Sep 2015 08:28:29 -0400
From: Brendan Mahoney
To: user@accumulo.apache.org
Subject: Client fails due to single Zookeeper node failure

Hi,

We have an Accumulo v1.6.1 cluster with a 5-node Zookeeper v3.4.5 cluster. One of the Zookeeper nodes crashed and all Accumulo client connections (including the shell) now fail with:

ERROR: java.lang.RuntimeException: Failed to connect to zookeeper (node_11:2181) within 2x zookeeper timeout period 30000.

If we move the bad Zookeeper node (node_11) to the end of the Zookeeper node list in accumulo-site.xml, clients connect successfully. Is the first Zookeeper node in the list a single-point-of-failure for our Accumulo cluster?

Thanks,
Brendan
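
To confirm which entries in the Zookeeper node list are actually reachable from a client machine, a small stand-alone check along the following lines may help. It is only a sketch: the class name ZkHostCheck, its helper methods, and the 3-second timeout are made up for illustration and are not part of Accumulo or Zookeeper. It assumes the list is a comma-separated string of host:port entries, with 2181 used when the port is omitted. A successful TCP connect only shows that the port accepts connections, not that the Zookeeper server behind it is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not an Accumulo or Zookeeper class.
class ZkHostCheck {
    static final int DEFAULT_PORT = 2181;

    /** Splits "host1:2181,host2" into host/port pairs; a missing port defaults to 2181. */
    static List<InetSocketAddress> parseHosts(String connectString) {
        List<InetSocketAddress> result = new ArrayList<>();
        for (String entry : connectString.split(",")) {
            String s = entry.trim();
            if (s.isEmpty()) {
                continue; // tolerate stray commas
            }
            int colon = s.lastIndexOf(':');
            String host = colon < 0 ? s : s.substring(0, colon);
            int port = colon < 0 ? DEFAULT_PORT : Integer.parseInt(s.substring(colon + 1));
            // Unresolved on purpose: parsing should not trigger DNS lookups.
            result.add(InetSocketAddress.createUnresolved(host, port));
        }
        return result;
    }

    /** Returns true if a TCP connection to the address succeeds within timeoutMs. */
    static boolean canConnect(InetSocketAddress addr, int timeoutMs) {
        try (Socket socket = new Socket()) {
            // Build a fresh address so the host name is resolved at connect time.
            socket.connect(new InetSocketAddress(addr.getHostString(), addr.getPort()), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers unknown hosts, refused connections, and timeouts.
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: ZkHostCheck host1:port,host2:port,...");
            System.exit(2);
        }
        for (InetSocketAddress addr : parseHosts(args[0])) {
            String status = canConnect(addr, 3000) ? "reachable" : "NOT reachable";
            System.out.println(addr.getHostString() + ":" + addr.getPort() + " " + status);
        }
    }
}
```

Run it with the same comma-separated host list that is configured in accumulo-site.xml; each entry is reported on its own line, in list order.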