hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jonathan Hsieh (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-6625) If we have hundreds of thousands of regions getChildren will encouter zk exception
Date Wed, 22 Aug 2012 23:54:42 GMT

    [ https://issues.apache.org/jira/browse/HBASE-6625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439955#comment-13439955
] 

Jonathan Hsieh commented on HBASE-6625:
---------------------------------------

I slightly misspoke in my previous comment -- the exception is in the assignment manager so
this would be 100k regions per cluster.  So its an exabyte of data for the hbase cluster (not
per region server).  Still, large enough to last for at least a year or two. :)
                
> If we have hundreds of thousands of  regions getChildren will encouter zk exception
> -----------------------------------------------------------------------------------
>
>                 Key: HBASE-6625
>                 URL: https://issues.apache.org/jira/browse/HBASE-6625
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Zhou wenjian
>            Assignee: Zhou wenjian
>
> 2012-05-13 19:37:37,528 DEBUG org.apache.hadoop.hbase.master.AssignmentManager$ExistsUnassignedAsyncCallback:
rs=CreateNewTableWith100000Regions,\x05\xB3\x06 
> g\xE8r\xBB]\x09\xCF,1336724029944.079cb2f8a375e66fa089291b82f2a03f. state=OFFLINE, ts=1336909053108

> 2012-05-13 19:37:37,528 DEBUG org.apache.hadoop.hbase.master.AssignmentManager$CreateUnassignedAsyncCallback:
rs=CreateNewTableWith100000Regions,\x08s\x84\x8 
> 8$7\xB1\xC4\xFCg,1336724030660.76c07780231942231013c7feb5e5eb14. state=OFFLINE, ts=1336909055089,
server=dw76.kgb.sqa.cm4,60020,1336908983944 
> 2012-05-13 19:37:37,528 DEBUG org.apache.hadoop.hbase.master.AssignmentManager$CreateUnassignedAsyncCallback:
rs=CreateNewTableWith100000Regions,\x08s\x89\xC 
> B\x9B\xF0\xE4\xCA\x97\xB0,1336724030660.fa38b9d8367387a64a327087cb43b3e0. state=OFFLINE,
ts=1336909055089, server=dw76.kgb.sqa.cm4,60020,1336908983944 
> 2012-05-13 19:37:37,528 INFO org.apache.hadoop.hbase.master.AssignmentManager: dw76.kgb.sqa.cm4,60020,1336908983944
unassigned znodes=58464 of total=120002 
> 2012-05-13 19:37:37,758 WARN org.apache.zookeeper.ClientCnxn: Session 0x13745fc2c8d0001
for server dw51.kgb.sqa.cm4/10.232.98.51:2180, unexpected error, clos 
> ing socket connection and attempting reconnect 
> java.io.IOException: Packet len4320092 is out of range! 
>         at org.apache.zookeeper.ClientCnxn$SendThread.readLength(ClientCnxn.java:710)

>         at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:869) 
>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1130) 
> 2012-05-13 19:37:37,860 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x13745fc2c8d0001
Unable to list children of znode /hbase-new4/unassigned 
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
for /hbase-new4/unassigned 
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:90) 
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:42) 
>         at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1243) 
>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.listChildrenAndWatchForNewChildren(ZKUtil.java:302)

>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndGetNewChildren(ZKUtil.java:413)

>         at org.apache.hadoop.hbase.master.AssignmentManager.nodeChildrenChanged(AssignmentManager.java:759)

>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:314)

>         at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)

>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506) 
> 2012-05-13 19:37:37,861 ERROR org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: master:60000-0x13745fc2c8d0001
Received unexpected KeeperException, re-thro 
> wing exception 
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
for /hbase-new4/unassigned 
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:90) 
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:42) 
>         at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1243) 
>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.listChildrenAndWatchForNewChildren(ZKUtil.java:302)

>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndGetNewChildren(ZKUtil.java:413)

>         at org.apache.hadoop.hbase.master.AssignmentManager.nodeChildrenChanged(AssignmentManager.java:759)

>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:314)

>         at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)

>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506) 
> 2012-05-13 19:37:37,861 FATAL org.apache.hadoop.hbase.master.HMaster: Unexpected ZK exception
reading unassigned children 
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
for /hbase-new4/unassigned 
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:90) 
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:42) 
>         at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1243) 
>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.listChildrenAndWatchForNewChildren(ZKUtil.java:302)

>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndGetNewChildren(ZKUtil.java:413)

>         at org.apache.hadoop.hbase.master.AssignmentManager.nodeChildrenChanged(AssignmentManager.java:759)

>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:314)

>         at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)

>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506) 
> 2012-05-13 19:37:37,861 INFO org.apache.hadoop.hbase.master.HMaster: Aborting

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message