zookeeper-user mailing list archives

From Something Something <mailinglist...@gmail.com>
Subject Re: HBase dies after some time
Date Tue, 29 May 2012 21:24:39 GMT
Hmm.. due to budget constraints, I am forced to install ZooKeeper on the
same machine that runs TaskTracker.  When a big MR job starts it fires up
over 40 tasks, so as you implied this could definitely be related to memory.

Should ZooKeepers be started on their own machines?  Right now I have
ZooKeeper, HRegionServer & TaskTracker running on the same machine.  This
is a bad idea, right?  Is there any way to get ZooKeeper working under
these restrictions?
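
To make the memory question concrete, here is a rough sketch of the check suggested in the quoted message below: add up the maximum heap of every JVM on the box and compare it with physical RAM. All of the numbers are hypothetical placeholders, not measurements from this cluster, and `heap_budget` and `os_reserve_mb` are made-up names for illustration only.

```python
def heap_budget(heaps_mb, physical_mb, os_reserve_mb=1024):
    """Return (total_heap_mb, headroom_mb).

    A negative headroom means the configured heaps, plus a reserve for the
    OS, exceed physical memory, so the machine can be pushed into swap.
    """
    total = sum(heaps_mb.values())
    return total, physical_mb - os_reserve_mb - total


# Hypothetical per-process maximum heaps (-Xmx) in MB; substitute real values.
heaps = {
    "ZooKeeper": 1024,
    "HRegionServer": 4096,
    "DataNode": 1024,
    "TaskTracker": 1024,
    # Assumed: 8 concurrent task slots, each child JVM capped at 512 MB.
    "MR child tasks": 8 * 512,
}

total, headroom = heap_budget(heaps, physical_mb=7680)
print("total heap: %d MB, headroom: %d MB" % (total, headroom))
if headroom < 0:
    print("Oversubscribed: reduce heaps or task slots, or move services.")
```

If the headroom comes out negative, heavy MR load could plausibly starve ZooKeeper long enough for sessions to expire, which would fit the ConnectionLoss errors.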

By the way, the ZooKeeper log shows this:

2012-05-29 13:56:54,842 - ERROR [CommitProcessor:2:NIOServerCnxn@445] - Unexpected Exception:
java.nio.channels.CancelledKeyException
        at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
        at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59)
        at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:418)
        at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1509)
        at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:367)
        at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:73)




On Sat, May 26, 2012 at 2:28 AM, Christian Schäfer
<syrious3000@hotmail.de> wrote:

>
> Hi,
>
> I got exactly the same behaviour and exceptions that you mention on a
> local cluster.
>
> In my case the sum of all services' heap sizes was higher than the actual
> memory of the machine.
> First, sum the heap sizes of the services on your master machine, which is
> likely running NameNode, HMaster, ZooKeeper, and maybe also RegionServer
> and DataNode. Then check that this sum is less than your master machine's
> memory.
>
> Good Luck.
> Chris
>
>  From: Something Something <mailinglists19@gmail.com>
>  To: hbase-user@hadoop.apache.org; zookeeper-user@hadoop.apache.org
>  Sent: 3:22 Saturday, 26 May 2012
>  Subject: HBase dies after some time
>
> Hello,
>
> I recently installed ZooKeeper & HBase on our dedicated Hadoop cluster on
> EC2.  HBase stays active for some time, but after a while it dies with
> error messages similar to these:
>
> 2012-05-25 12:09:27,514 ERROR org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: master:60000-0x5378489312c0004-0x5378489312c0004 Received unexpected KeeperException, re-throwing exception
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
>        at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>        at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>        at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
>        at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:549)
>        at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAsAddress(ZKUtil.java:620)
>        at org.apache.hadoop.hbase.master.ActiveMasterManager.stop(ActiveMasterManager.java:197)
>        at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:310)
> 2012-05-25 12:09:27,514 ERROR org.apache.hadoop.hbase.master.ActiveMasterManager: master:60000-0x5378489312c0004-0x5378489312c0004 Error deleting our own master address node
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
>        at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>        at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>        at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
>        at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:549)
>        at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAsAddress(ZKUtil.java:620)
>        at org.apache.hadoop.hbase.master.ActiveMasterManager.stop(ActiveMasterManager.java:197)
>        at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:310)
>
>
> This kills the HMaster as well as all HRegionServers.  Could it be that my
> ZooKeeper setup is incorrect?  Please help.  Thanks.
>
