hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Zookeeper exception leading to the shutdown of HBase
Date Thu, 14 Apr 2011 18:25:06 GMT
In the first case there is clearly a pause of about 13 seconds, and in
the second the log shows a 60-second stretch during which the master's
ZooKeeper client could not talk to the ZooKeeper server. As far as I
can tell, something odd is going on in your environment (network
issues, maybe?).
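
A minimal sketch of how these gaps could be pulled out of a master log, in case it helps when comparing them against network or GC events on the same hosts. It only assumes the "have not heard from server in Nms" message format seen in the quoted logs; the function name `find_session_timeouts` and the regular expressions are illustrative and not part of HBase or ZooKeeper.

```python
import re
from datetime import datetime

# Message format taken from the quoted logs, e.g.
# "have not heard from server in 13336ms for sessionid 0x22e8e6ee15f0046"
TIMEOUT_RE = re.compile(
    r"have not heard from server in (\d+)ms for sessionid (0x[0-9a-fA-F]+)"
)
# log4j-style timestamp as it appears in the logs: 2011-03-21 13:26:53,035
TIMESTAMP_RE = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")


def find_session_timeouts(log_text):
    """Return a list of (timestamp, silence_ms, session_id) tuples.

    timestamp is the nearest log timestamp before the message,
    or None if no timestamp precedes it.
    """
    # Mail clients wrap long log lines, so collapse all whitespace first;
    # a message split across lines then still matches.
    flat = " ".join(log_text.split())
    results = []
    for match in TIMEOUT_RE.finditer(flat):
        # Search only the text before the message and keep the last timestamp.
        stamps = TIMESTAMP_RE.findall(flat, 0, match.start())
        when = None
        if stamps:
            when = datetime.strptime(stamps[-1], "%Y-%m-%d %H:%M:%S,%f")
        results.append((when, int(match.group(1)), match.group(2)))
    return results
```

Run over the first quoted excerpt, this reports a single 13336 ms silence for session 0x22e8e6ee15f0046, logged at 13:26:53,035. It says nothing about the cause; it only makes the pauses easy to line up with other logs.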

J-D

On Thu, Apr 14, 2011 at 12:51 AM, bijieshan <bijieshan@huawei.com> wrote:
> Hi,
>   I ran into this problem while the HBase cluster was running. Here is the log output:
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> 2011-03-21 13:26:39,697 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server t1/157.5.111.11:2181
> 2011-03-21 13:26:39,698 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to t1/157.5.111.11:2181, initiating session
> 2011-03-21 13:26:53,035 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 13336ms for sessionid 0x22e8e6ee15f0046, closing socket connection and attempting reconnect
> 2011-03-21 13:26:53,135 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22e8e6ee15f0046 Unable to get data of znode /hbase/unassigned/59ba25120921011b7d9ed4025d30c105
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/unassigned/59ba25120921011b7d9ed4025d30c105
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:932)
>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:549)
>         at org.apache.hadoop.hbase.zookeeper.ZKAssign.getData(ZKAssign.java:739)
>         at org.apache.hadoop.hbase.master.AssignmentManager.nodeDataChanged(AssignmentManager.java:525)
>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:268)
>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:501)
> 2011-03-21 13:26:53,137 ERROR org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: master:60000-0x22e8e6ee15f0046 Received unexpected KeeperException, re-throwing exception
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/unassigned/59ba25120921011b7d9ed4025d30c105
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:932)
>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:549)
>         at org.apache.hadoop.hbase.zookeeper.ZKAssign.getData(ZKAssign.java:739)
>         at org.apache.hadoop.hbase.master.AssignmentManager.nodeDataChanged(AssignmentManager.java:525)
>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:268)
>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:501)
> 2011-03-21 13:26:53,138 FATAL org.apache.hadoop.hbase.master.HMaster: Unexpected ZK exception reading unassigned node data
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/unassigned/59ba25120921011b7d9ed4025d30c105
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:932)
>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:549)
>         at org.apache.hadoop.hbase.zookeeper.ZKAssign.getData(ZKAssign.java:739)
>         at org.apache.hadoop.hbase.master.AssignmentManager.nodeDataChanged(AssignmentManager.java:525)
>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:268)
>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:501)
> 2011-03-21 13:26:53,138 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> When I restart the cluster, the problem still exists (due to the abnormal ZooKeeper process):
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> 11-03-21 14:43:27,565 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=t2:2181,t1:2181,t0:2181 sessionTimeout=180000 watcher=master:60000
> 2011-03-21 14:43:27,573 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server t1/157.5.111.11:2181
> 2011-03-21 14:43:27,582 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to t1/157.5.111.11:2181, initiating session
> 2011-03-21 14:44:27,586 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 60003ms for sessionid 0x0, closing socket connection and attempting reconnect
> 2011-03-21 14:44:27,699 ERROR org.apache.hadoop.hbase.master.HMasterCommandLine: Failed to start master
> java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMaster
>         at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:1071)
>         at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:142)
>         at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:102)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
>         at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1085)
> Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>         at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:648)
>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndFailSilent(ZKUtil.java:902)
>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:133)
>         at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:219)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>         at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:1066)
>         ... 5 more
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> This problem looks most similar to the one described in:
> https://issues.apache.org/jira/browse/HBASE-3062
> That bug was fixed in HBase 0.90.1.
> Please help analyze the problem. Thank you.
> I look forward to your response.
>
> Regards,
> Jieshan
>
>
