hbase-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HBASE-6625) If we have hundreds of thousands of regions getChildren will encouter zk exception
Date Sat, 19 Jan 2013 23:34:12 GMT

     [ https://issues.apache.org/jira/browse/HBASE-6625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

stack resolved HBASE-6625.
--------------------------

    Resolution: Duplicate

Resolving as a dup of hbase-4246.  We also have a comment on workaround in here http://hbase.apache.org/book.html#trouble.master
 Jon's suggestion that we bound the maximum number of regions on a cluster is a good one.
 Lets do that in another issue.
                
> If we have hundreds of thousands of  regions getChildren will encouter zk exception
> -----------------------------------------------------------------------------------
>
>                 Key: HBASE-6625
>                 URL: https://issues.apache.org/jira/browse/HBASE-6625
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Zhou wenjian
>            Assignee: Zhou wenjian
>
> 2012-05-13 19:37:37,528 DEBUG org.apache.hadoop.hbase.master.AssignmentManager$ExistsUnassignedAsyncCallback:
rs=CreateNewTableWith100000Regions,\x05\xB3\x06 
> g\xE8r\xBB]\x09\xCF,1336724029944.079cb2f8a375e66fa089291b82f2a03f. state=OFFLINE, ts=1336909053108

> 2012-05-13 19:37:37,528 DEBUG org.apache.hadoop.hbase.master.AssignmentManager$CreateUnassignedAsyncCallback:
rs=CreateNewTableWith100000Regions,\x08s\x84\x8 
> 8$7\xB1\xC4\xFCg,1336724030660.76c07780231942231013c7feb5e5eb14. state=OFFLINE, ts=1336909055089,
server=dw76.kgb.sqa.cm4,60020,1336908983944 
> 2012-05-13 19:37:37,528 DEBUG org.apache.hadoop.hbase.master.AssignmentManager$CreateUnassignedAsyncCallback:
rs=CreateNewTableWith100000Regions,\x08s\x89\xC 
> B\x9B\xF0\xE4\xCA\x97\xB0,1336724030660.fa38b9d8367387a64a327087cb43b3e0. state=OFFLINE,
ts=1336909055089, server=dw76.kgb.sqa.cm4,60020,1336908983944 
> 2012-05-13 19:37:37,528 INFO org.apache.hadoop.hbase.master.AssignmentManager: dw76.kgb.sqa.cm4,60020,1336908983944
unassigned znodes=58464 of total=120002 
> 2012-05-13 19:37:37,758 WARN org.apache.zookeeper.ClientCnxn: Session 0x13745fc2c8d0001
for server dw51.kgb.sqa.cm4/10.232.98.51:2180, unexpected error, clos 
> ing socket connection and attempting reconnect 
> java.io.IOException: Packet len4320092 is out of range! 
>         at org.apache.zookeeper.ClientCnxn$SendThread.readLength(ClientCnxn.java:710)

>         at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:869) 
>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1130) 
> 2012-05-13 19:37:37,860 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x13745fc2c8d0001
Unable to list children of znode /hbase-new4/unassigned 
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
for /hbase-new4/unassigned 
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:90) 
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:42) 
>         at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1243) 
>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.listChildrenAndWatchForNewChildren(ZKUtil.java:302)

>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndGetNewChildren(ZKUtil.java:413)

>         at org.apache.hadoop.hbase.master.AssignmentManager.nodeChildrenChanged(AssignmentManager.java:759)

>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:314)

>         at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)

>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506) 
> 2012-05-13 19:37:37,861 ERROR org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: master:60000-0x13745fc2c8d0001
Received unexpected KeeperException, re-thro 
> wing exception 
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
for /hbase-new4/unassigned 
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:90) 
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:42) 
>         at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1243) 
>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.listChildrenAndWatchForNewChildren(ZKUtil.java:302)

>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndGetNewChildren(ZKUtil.java:413)

>         at org.apache.hadoop.hbase.master.AssignmentManager.nodeChildrenChanged(AssignmentManager.java:759)

>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:314)

>         at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)

>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506) 
> 2012-05-13 19:37:37,861 FATAL org.apache.hadoop.hbase.master.HMaster: Unexpected ZK exception
reading unassigned children 
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
for /hbase-new4/unassigned 
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:90) 
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:42) 
>         at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1243) 
>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.listChildrenAndWatchForNewChildren(ZKUtil.java:302)

>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndGetNewChildren(ZKUtil.java:413)

>         at org.apache.hadoop.hbase.master.AssignmentManager.nodeChildrenChanged(AssignmentManager.java:759)

>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:314)

>         at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)

>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506) 
> 2012-05-13 19:37:37,861 INFO org.apache.hadoop.hbase.master.HMaster: Aborting

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message