Date: Thu, 16 Feb 2012 17:03:49 -0800
Subject: unstable secure zookeeper
From: Francis Liu

Hi,

I have 0.92-security installed. I'm hitting intermittent problems starting the regionservers because of intermittent ZooKeeper connection failures. As a result, not all of my regionservers start up after "start regionservers". This also sometimes happens on the master server.

On the regionserver the error looks like:

2012-02-16 02:57:28,086 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server *snip*,60020,1329361047462: Unexpected exception during initialization, aborting
org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth for /hbase/shutdown
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1131)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:295)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:518)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:494)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:77)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:561)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:524)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:625)
    at java.lang.Thread.run(Thread.java:619)

If the server does start up successfully, things run fine. Prior to this I had 0.92 without security running fine.

Any ideas on what could be causing this?

-Francis