hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Zookeeper: KeeperErrorCode NoNode for /hbase/backup-masters
Date Thu, 09 Aug 2012 22:30:05 GMT
I'm not familiar with ZooKeeper snapshot recovery errors; in fact, I
don't think I've ever seen one. Looking over your hbase-site.xml, though,
I see that you didn't change where ZK stores its data, which means it
goes to /tmp. It wouldn't be a stretch to say that some of those files
are gone and your ZK data is now in an inconsistent state.

The name of the configuration property is kinda buried in the docs; it's here:
http://hbase.apache.org/book.html#zookeeper

Look for hbase.zookeeper.property.dataDir

Once that's changed, ZK should start normally from the new ZK data folder.
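
As a rough sketch, the entry in hbase-site.xml (inside the existing
<configuration> element) could look something like the following. The
property name is the one from the book; the directory is only a
placeholder I'm assuming for illustration, so use whatever persistent
location outside /tmp fits your machines.

```xml
<!-- Sketch only: the value below is an assumed example path, not a recommendation -->
<property>
  <name>hbase.zookeeper.property.dataDir</name>
  <!-- Any persistent directory outside /tmp that the user running HBase can write to -->
  <value>/var/lib/hbase/zookeeper</value>
</property>
```

This applies when HBase manages ZooKeeper itself, which the HQuorumPeer
frames in your stack trace suggest is the case here.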

J-D

On Thu, Aug 9, 2012 at 2:05 AM, Kristoffer Sjögren <stoffe@gmail.com> wrote:
> Hi all
>
> I have a problem starting hbase in a fully distributed 3 machine setup (2
> datanodes/regionservers + 1 master/namenode). For some reason zookeeper on
> master complains about not finding /hbase/backup-masters in
> hbase-user-zookeeper-host.out.
>
> java.io.IOException: Failed to process transaction type: 1 error:
> KeeperErrorCode = NoNode for /hbase/backup-masters
>     at
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:151)
>     at
> org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
>     at
> org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:259)
>     at
> org.apache.zookeeper.server.ZooKeeperServer.startdata(ZooKeeperServer.java:386)
>     at
> org.apache.zookeeper.server.NIOServerCnxnFactory.startup(NIOServerCnxnFactory.java:138)
>     at
> org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:112)
>     at
> org.apache.hadoop.hbase.zookeeper.HQuorumPeer.runZKServer(HQuorumPeer.java:78)
>     at
> org.apache.hadoop.hbase.zookeeper.HQuorumPeer.main(HQuorumPeer.java:63)
> Caused by: org.apache.zookeeper.KeeperException$NoNodeException:
> KeeperErrorCode = NoNode for /hbase/backup-masters
>     at
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.processTransaction(FileTxnSnapLog.java:209)
>     at
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:149)
>
> hbase-user-zookeeper-host.log further complains about not finding the
> transaction log.
>
> 2012-08-09 08:46:30,924 ERROR
> org.apache.zookeeper.server.persistence.FileTxnSnapLog: Parent
> /hbase/backup-masters missing for
> /hbase/backup-masters/vhp11.aphelion.se,60000,1343122915296
>
> This prevents the master and regionservers from forming a quorum on master's port 2181:
>
> 2012-08-09 08:46:48,807 INFO org.apache.zookeeper.ClientCnxn: Opening socket
> connection to server vhp11.aphelion.se/192.168.1.250:2181
> 2012-08-09 08:46:48,808 WARN
> org.apache.zookeeper.client.ZooKeeperSaslClient: SecurityException:
> java.lang.SecurityException: Unable to locate a login configuration occurred
> when trying to find JAAS configuration.
> 2012-08-09 08:46:48,808 INFO
> org.apache.zookeeper.client.ZooKeeperSaslClient: Client will not
> SASL-authenticate because the default JAAS configuration section 'Client'
> could not be found. If you are not using SASL, you may ignore this. On the
> other hand, if you expected SASL to work, please fix your JAAS
> configuration.
> 2012-08-09 08:46:48,808 WARN org.apache.zookeeper.ClientCnxn: Session 0x0
> for server null, unexpected error, closing socket connection and attempting
> reconnect
> java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
>     at
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:286)
>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1035)
> 2012-08-09 08:46:48,910 WARN
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient
> ZooKeeper exception:
> org.apache.zookeeper.KeeperException$ConnectionLossException:
> KeeperErrorCode = ConnectionLoss for /hbase
> 2012-08-09 08:46:48,911 ERROR
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: ZooKeeper exists
> failed after 3 retries
> 2012-08-09 08:46:48,911 ERROR
> org.apache.hadoop.hbase.master.HMasterCommandLine: Failed to start master
> java.lang.RuntimeException: Failed construction of Master: class
> org.apache.hadoop.hbase.master.HMaster
>     at
> org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:1623)
>     at
> org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:144)
>     at
> org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at
> org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
>     at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1637)
> Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException:
> KeeperErrorCode = ConnectionLoss for /hbase
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021)
>     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1049)
>     at
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:189)
>     at
> org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndFailSilent(ZKUtil.java:892)
>     at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.createBaseZNodes(ZooKeeperWatcher.java:161)
>     at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:154)
>     at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:274)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>     at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>     at
> org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:1618)
>     ... 5 more
>
> Not really sure how to proceed. Thankful for any help or pointers.
>
> Configuration and logs files attached.
>
> Cheers,
> -Kristoffer
>
>
