hbase-user mailing list archives

From Daniel Iancu <daniel.ia...@1and1.ro>
Subject Hbase 0.90.1 master crashes regularly
Date Fri, 04 Mar 2011 14:39:21 GMT
Hi,
I've updated our dev environment from HBase 0.90.0 (ASF+CDH3b3), which
was very stable, to HBase 0.90.1 (CDH3B4), and since then the HMaster
dies regularly. The issue seems to be related to the connection to
ZooKeeper. Even if I use a standby HMaster, it also dies from the same
cause:


2011-03-04 15:05:54,699 FATAL org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Unexpected exception handling nodeDeleted event
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
     at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
     at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:809)
     at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:232)
     at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.nodeDeleted(ZooKeeperNodeTracker.java:165)
     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:261)
     at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
     at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)
2011-03-04 15:05:54,704 INFO org.apache.zookeeper.ZooKeeper: Session: 0x22e80dcc2350001 closed
2011-03-04 15:05:54,704 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2011-03-04 15:05:54,718 INFO org.apache.zookeeper.ZooKeeper: Session: 0x22e80dcc2350000 closed
2011-03-04 15:05:54,718 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2011-03-04 15:05:54,718 INFO org.apache.hadoop.hbase.master.HMaster: HMaster main thread exiting


Just before this one there is another exception:

2011-03-04 15:07:00,611 FATAL org.apache.hadoop.hbase.master.HMaster: Failed assignment of regions to serverName=desktop,60020,1299242075991, load=(requests=0, regions=0, usedHeap=34, maxHeap=996)
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed setting up proxy interface org.apache.hadoop.hbase.ipc.HRegionInterface to /172.28.124.148:60020 after attempts=1
     at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:355)
     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:954)
     at org.apache.hadoop.hbase.master.ServerManager.getServerConnection(ServerManager.java:606)
     at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:560)
     at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:776)
     at org.apache.hadoop.hbase.master.AssignmentManager$SingleServerBulkAssigner.run(AssignmentManager.java:1310)
     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
     at java.lang.Thread.run(Thread.java:619)
Caused by: java.net.ConnectException: Connection refused
     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:408)
     at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:328)
     at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:883)
     at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:750)
     at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
     at $Proxy6.getProtocolVersion(Unknown Source)
     at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419)
     at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393)
     at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444)
     at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349)
     ... 8 more
2011-03-04 15:07:00,615 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
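
The "Connection refused" in the trace above can be checked
independently of HBase with a plain TCP probe. A minimal sketch,
assuming Python is available on the master host; can_connect is a
hypothetical helper, not part of HBase, and the address and port are
the ones from the log:

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Address and port taken from the RetriesExhaustedException in the log.
    print(can_connect("172.28.124.148", 60020))
```

A False result would only suggest that nothing was accepting
connections on that port at the time of the probe (for example, the
region server was down or bound to a different interface). It does not
by itself explain the ZooKeeper ConnectionLoss.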

Any hints as to what could be wrong here?

Thanks
Daniel
