hbase-user mailing list archives

From bijieshan <bijies...@huawei.com>
Subject Rs: Does it necessarily to handle the "Zookeeper.ConnectionLossException" in ZKUtil.getDataAndWatch?
Date Wed, 20 Apr 2011 01:39:39 GMT
Thanks J-D.
I have learned that several things can lead to a ConnectionLossException, such as a
full GC, heavy swapping, or I/O waits.
Regarding I/O waits in particular, do you have any suggestions about the network
setup? In my current environment, ZooKeeper, HDFS, and HBase all run on the same machine.
Could that cause problems?

Regards,
Jeason Bean

-----Original Message-----
From: jdcryans@gmail.com [mailto:jdcryans@gmail.com] On Behalf Of Jean-Daniel Cryans
Sent: April 19, 2011 1:14
To: user@hbase.apache.org
Subject: Re: Does it necessarily to handle the "Zookeeper.ConnectionLossException" in ZKUtil.getDataAndWatch?

Take a look at the ZooKeeper server log; it should give you a clue. If
it says there are too many connections, then you're hitting a well-known
problem with HBase 0.90. Just look for the other threads on this
mailing list about that.

J-D

On Sat, Apr 16, 2011 at 3:01 AM, bijieshan <bijieshan@huawei.com> wrote:
> Thanks for Jean-Daniel Cryans's reply.
> I have looked at HBASE-3065, and it is indeed the same problem.
> Liyin Tang has proposed a fix for this issue: when a ConnectionLossException occurs,
> retry a few times to reconnect to the ZK server.
> The reconnect will probably succeed in most cases, but not always.
> In my scenario:
> 1. The ConnectionLossException occurred.
> 2. The HMaster process aborted because its session expired.
> 3. When I restarted the HMaster process, the ConnectionLossException occurred again,
>    so initialization failed and the HMaster aborted again.
>
> My question is: under what conditions does a ConnectionLossException occur? I know
> network problems can cause it. Are there other possible causes?
> Thanks!
>
> Jieshan Bean
>
> ===================================================================================================================
> -----Original Message-----
> From: jdcryans@gmail.com [mailto:jdcryans@gmail.com] On Behalf Of Jean-Daniel Cryans
> Sent: April 15, 2011 2:27
> To: user@hbase.apache.org
> Subject: Re: Does it necessarily to handle the "Zookeeper.ConnectionLossException" in ZKUtil.getDataAndWatch?
>
> I guess we should, there's
> https://issues.apache.org/jira/browse/HBASE-3065 that's open, but in
> your case like I mentioned in your other email there seems to be
> something weird in your environment.
>
> J-D
>
> On Thu, Apr 14, 2011 at 12:51 AM, bijieshan <bijieshan@huawei.com> wrote:
>> Hi,
>> A "KeeperException$ConnectionLossException" occurred while the cluster was
>> running. As we know, it is a ZooKeeper "recoverable" exception (and it is already
>> handled in ZooKeeperWatcher.ZooKeeperWatcher), and the suggestion is that we
>> should retry for a while. Is that necessary?
>>
>> Here are the exception logs:
>>
>> 2011-03-21 13:26:53,135 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22e8e6ee15f0046 Unable to get data of znode /hbase/unassigned/59ba25120921011b7d9ed4025d30c105
>> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/unassigned/59ba25120921011b7d9ed4025d30c105
>>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:932)
>>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:549)
>>         at org.apache.hadoop.hbase.zookeeper.ZKAssign.getData(ZKAssign.java:739)
>>         at org.apache.hadoop.hbase.master.AssignmentManager.nodeDataChanged(AssignmentManager.java:525)
>>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:268)
>>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:501)
>> 2011-03-21 13:26:53,137 ERROR org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: master:60000-0x22e8e6ee15f0046 Received unexpected KeeperException, re-throwing exception
>> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/unassigned/59ba25120921011b7d9ed4025d30c105
>>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:932)
>>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:549)
>>         at org.apache.hadoop.hbase.zookeeper.ZKAssign.getData(ZKAssign.java:739)
>>         at org.apache.hadoop.hbase.master.AssignmentManager.nodeDataChanged(AssignmentManager.java:525)
>>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:268)
>>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:501)
>>
>> Looking forward to your reply!
>> Thank you.
>>
>> Regards,
>> Jeason Bean
>>
>>
>
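
Editor's sketch: the thread discusses retrying a ZooKeeper read a few times when a ConnectionLossException occurs (the approach proposed in HBASE-3065). The Java below is a minimal illustration of that idea, not the actual HBase patch. To stay self-contained it does not depend on the ZooKeeper library: the nested ConnectionLossException class is a stand-in for org.apache.zookeeper.KeeperException.ConnectionLossException, and ZkRead, retryOnConnectionLoss, the attempt count, and the backoff values are all hypothetical names and settings chosen for the example.

```java
public class ZkRetrySketch {

    /** Stand-in for org.apache.zookeeper.KeeperException.ConnectionLossException. */
    static class ConnectionLossException extends Exception {
        ConnectionLossException(String path) {
            super("KeeperErrorCode = ConnectionLoss for " + path);
        }
    }

    /** Hypothetical read operation, e.g. a thin wrapper around a getData call. */
    interface ZkRead<T> {
        T call() throws ConnectionLossException;
    }

    /**
     * Runs the read, retrying only on connection loss, with exponential backoff.
     * Rethrows the last ConnectionLossException once maxAttempts is exhausted.
     */
    static <T> T retryOnConnectionLoss(ZkRead<T> op, int maxAttempts, long initialBackoffMs)
            throws ConnectionLossException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long backoffMs = initialBackoffMs;
        ConnectionLossException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (ConnectionLossException e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMs); // give the client time to reconnect
                    backoffMs *= 2;
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated read: fails twice with connection loss, then succeeds.
        String data = retryOnConnectionLoss(() -> {
            if (++calls[0] < 3) {
                throw new ConnectionLossException("/hbase/unassigned/example");
            }
            return "region-state";
        }, 5, 10L);
        System.out.println(data + " after " + calls[0] + " attempts");
    }
}
```

Retrying a read such as getData is safe because the operation is idempotent. As noted in the thread, a retry does not always succeed, and it does not help once the session itself has expired; in that case the client needs a new session.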