hbase-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-10310) ZNodeCleaner session expired for /hbase/master
Date Sat, 11 Jan 2014 01:43:50 GMT

    [ https://issues.apache.org/jira/browse/HBASE-10310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868595#comment-13868595 ]

Hudson commented on HBASE-10310:
--------------------------------

SUCCESS: Integrated in HBase-0.98 #69 (See [https://builds.apache.org/job/HBase-0.98/69/])
HBASE-10310. ZNodeCleaner session expired for /hbase/master (Samir Ahmic) (apurtell: rev 1557274)
* /hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/ZNodeClearer.java


> ZNodeCleaner session expired for /hbase/master
> ----------------------------------------------
>
>                 Key: HBASE-10310
>                 URL: https://issues.apache.org/jira/browse/HBASE-10310
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.96.1.1
>         Environment: x86_64 GNU/Linux
>            Reporter: Samir Ahmic
>            Assignee: Samir Ahmic
>             Fix For: 0.98.0, 0.96.2, 0.99.0
>
>         Attachments: HBASE-10310.patch
>
>
> I was testing the "hbase master clear" command while working on [HBASE-7386]. Here are the
> command and the resulting exception:
> {code}
> $ export HBASE_ZNODE_FILE=/tmp/hbase-hadoop-master.znode; ./hbase master clear
> 14/01/10 14:05:44 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=zk1:2181
sessionTimeout=90000 watcher=clean znode for master, quorum=zk1:2181, baseZNode=/hbase
> 14/01/10 14:05:44 INFO zookeeper.RecoverableZooKeeper: Process identifier=clean znode
for master connecting to ZooKeeper ensemble=zk1:2181
> 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Opening socket connection to server zk1/172.17.33.5:2181.
Will not attempt to authenticate using SASL (Unable to locate a login configuration)
> 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Socket connection established to zk11/172.17.33.5:2181,
initiating session
> 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Session establishment complete on server
zk1/172.17.33.5:2181, sessionid = 0x1427a96bfea4a8a, negotiated timeout = 40000
> 14/01/10 14:05:44 INFO zookeeper.ZooKeeper: Session: 0x1427a96bfea4a8a closed
> 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: EventThread shut down
> 14/01/10 14:05:44 WARN zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper,
quorum=zk1:2181, exception=org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode
= Session expired for /hbase/master
> 14/01/10 14:05:44 INFO util.RetryCounter: Sleeping 1000ms before retry #0...
> 14/01/10 14:05:45 WARN zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper,
quorum=zk1:2181, exception=org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode
= Session expired for /hbase/master
> 14/01/10 14:05:45 ERROR zookeeper.RecoverableZooKeeper: ZooKeeper getData failed after
1 attempts
> 14/01/10 14:05:45 WARN zookeeper.ZKUtil: clean znode for master-0x1427a96bfea4a8a, quorum=zk1:2181,
baseZNode=/hbase Unable to get data of znode /hbase/master
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session
expired for /hbase/master
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> 	at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
> 	at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:337)
> 	at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:777)
> 	at org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:170)
> 	at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:160)
> 	at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:138)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> 	at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
> 	at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2779)
> 14/01/10 14:05:45 ERROR zookeeper.ZooKeeperWatcher: clean znode for master-0x1427a96bfea4a8a,
quorum=zk1:2181, baseZNode=/hbase Received unexpected KeeperException, re-throwing exception
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session
expired for /hbase/master
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> 	at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
> 	at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:337)
> 	at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:777)
> 	at org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:170)
> 	at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:160)
> 	at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:138)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> 	at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
> 	at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2779)
> 14/01/10 14:05:45 WARN zookeeper.ZooKeeperNodeTracker: Can't get or delete the master
znode
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session
expired for /hbase/master
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> 	at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
> 	at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:337)
> 	at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:777)
> 	at org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:170)
> 	at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:160)
> 	at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:138)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> 	at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
> 	at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2779)
> {code}
> After checking ZNodeClearer.java, I noticed these lines:
> {code}
>  try {
>       znodeFileContent = ZNodeClearer.readMyEphemeralNodeOnDisk();
>       
>     } catch (FileNotFoundException fnfe) {
>       // If no file, just keep going -- return success.
>       LOG.warn("Can't find the znode file; presume non-fatal", fnfe);
>       return true;
>     } catch (IOException e) {
>       LOG.warn("Can't read the content of the znode file", e);
>       return false;
>     } finally {
>       zkw.close();
>     }
>     return MasterAddressTracker.deleteIfEquals(zkw, znodeFileContent);
>   }
> {code}
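> It looks like we are closing the ZooKeeper connection prematurely: the finally block calls
> zkw.close() before the return statement that follows it uses zkw. After moving
> {code} return MasterAddressTracker.deleteIfEquals(zkw, znodeFileContent); {code} inside
> the try block, the issue was fixed.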
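
For illustration only, here is a minimal, self-contained sketch of the ordering problem described above. It is not the committed patch and does not use the HBase or ZooKeeper classes: FakeWatcher, MASTER_DATA, clearBuggy and clearFixed are hypothetical stand-ins, and the "session expired" failure is simulated with a plain IOException. It only shows that a statement placed after a try/finally runs once finally has already closed the handle, while a return placed inside the try block is evaluated before finally runs.

```java
import java.io.IOException;

public class PrematureCloseSketch {

    // Hypothetical znode payload; the real content format is not shown in the report.
    static final String MASTER_DATA = "master-host,60000,1389362744000";

    /** Hypothetical stand-in for the ZooKeeper watcher; NOT the HBase class. */
    static class FakeWatcher implements AutoCloseable {
        private boolean closed = false;

        String getData(String znode) throws IOException {
            if (closed) {
                // Simulates the "Session expired" failure seen in the log.
                throw new IOException("Session expired for " + znode);
            }
            return MASTER_DATA;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Simplified stand-in for the delete step: only compares the content. */
    static boolean deleteIfEquals(FakeWatcher zkw, String expected) {
        try {
            return expected.equals(zkw.getData("/hbase/master"));
        } catch (IOException e) {
            return false;
        }
    }

    /** Reported shape: the watcher is used after finally has closed it. */
    static boolean clearBuggy(FakeWatcher zkw, String fileContent) {
        String content;
        try {
            content = fileContent; // reading the znode file would happen here
        } finally {
            zkw.close();
        }
        return deleteIfEquals(zkw, content);
    }

    /** Described fix: the return sits inside try, so it runs before close(). */
    static boolean clearFixed(FakeWatcher zkw, String fileContent) {
        try {
            String content = fileContent; // reading the znode file would happen here
            return deleteIfEquals(zkw, content);
        } finally {
            zkw.close();
        }
    }

    public static void main(String[] args) {
        System.out.println("buggy: " + clearBuggy(new FakeWatcher(), MASTER_DATA));
        System.out.println("fixed: " + clearFixed(new FakeWatcher(), MASTER_DATA));
    }
}
```

With these stand-ins, the first call reports false because the watcher is already closed when it is queried, and the second reports true; in both variants the watcher ends up closed.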



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
