hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HBASE-3068) IllegalStateException when new server comes online, is given 200 regions to open and 200th region gets timed out of regions in transition
Date Fri, 01 Oct 2010 23:03:32 GMT

     [ https://issues.apache.org/jira/browse/HBASE-3068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-3068:
-------------------------

        Status: Resolved  (was: Patch Available)
    Resolution: Fixed

Committing.  Tested it a couple of ways on a loaded cluster; I no longer see the IllegalStateException,
nor does the master crash.

> IllegalStateException when new server comes online, is given 200 regions to open and 200th region gets timed out of regions in transition
> -----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-3068
>                 URL: https://issues.apache.org/jira/browse/HBASE-3068
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: stack
>             Fix For: 0.90.0
>
>
> Yesterday we committed a change that makes the master crash if a ZK transition is unexpected.
> It's extreme, but good for highlighting bad state changes (we also started marking these as
> IllegalStateExceptions yesterday).
> So, testing the new master, I brought up a new server.  The balancer tried to give the new
> server 256 regions.
> {code}
> 2010-10-01 16:01:42,972 INFO org.apache.hadoop.hbase.master.LoadBalancer: Calculated a load balance in 0ms. Moving 256 regions off of 7 overloaded servers onto 1 less loaded servers
> {code}
> It turns out we failed to complete the open of all 256 regions within the regions-in-transition
> timeout period, so we tried to reassign.  The master aborted because the region was still in the
> PENDING_OPEN state when we went about assigning it.
> {code}
> 2010-10-01 16:02:28,809 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  usertable,user1128734802,1285701924906.006696a9bf346f8593df66728e18e029. state=PENDING_OPEN, ts=1285948921051
> 2010-10-01 16:02:28,809 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_OPEN or OPENING for too long, reassigning region=usertable,user1128734802,1285701924906.006696a9bf346f8593df66728e18e029.
> 2010-10-01 16:02:28,811 FATAL org.apache.hadoop.hbase.master.HMaster: Unexpected state trying to OFFLINE; usertable,user1128734802,1285701924906.006696a9bf346f8593df66728e18e029. state=PENDING_OPEN, ts=1285948921051
> java.lang.IllegalStateException
>     at org.apache.hadoop.hbase.master.AssignmentManager.setOfflineInZooKeeper(AssignmentManager.java:662)
>     at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:632)
>     at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:560)
>     at org.apache.hadoop.hbase.master.AssignmentManager$TimeoutMonitor.chore(AssignmentManager.java:1102)
>     at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
> {code}
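
For readers who want the failure mode in miniature, here is a rough sketch of the interaction described above: a timeout check decides a region has been PENDING_OPEN too long, reassignment then tries to force it OFFLINE, and a strict state check rejects that transition. This is not the HBase source. The class name, the set of states from which OFFLINE is allowed, and the 30-second timeout are all assumptions made for illustration; only the state names and the exception type come from the log above.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical, simplified model of the failure; NOT the real AssignmentManager.
final class RegionTransitionSketch {
    enum State { OFFLINE, PENDING_OPEN, OPENING, OPEN, CLOSED }

    // Assumption: only these states may be forced to OFFLINE.
    // PENDING_OPEN is deliberately absent, mirroring the crash in the log.
    static final Set<State> OFFLINE_ALLOWED_FROM = EnumSet.of(State.OFFLINE, State.CLOSED);

    // Strict transition check: anything unexpected is an IllegalStateException.
    static State setOffline(State current) {
        if (!OFFLINE_ALLOWED_FROM.contains(current)) {
            throw new IllegalStateException(
                "Unexpected state trying to OFFLINE; state=" + current);
        }
        return State.OFFLINE;
    }

    // True when the region has sat in its current state longer than timeoutMs.
    static boolean timedOut(long stateTimestampMs, long nowMs, long timeoutMs) {
        return nowMs - stateTimestampMs > timeoutMs;
    }

    public static void main(String[] args) {
        long timeoutMs = 30_000L;        // hypothetical timeout value
        long enteredPendingOpen = 0L;    // when the region went PENDING_OPEN
        long now = 47_000L;              // the server is still busy opening other regions

        State state = State.PENDING_OPEN;
        if (timedOut(enteredPendingOpen, now, timeoutMs)) {
            try {
                // The timeout path tries to reassign, which starts by forcing OFFLINE.
                state = setOffline(state);
            } catch (IllegalStateException e) {
                // With "crash on unexpected transition" enabled, this aborts the master.
                System.out.println("Would abort: " + e.getMessage());
            }
        }
    }
}
```

The point of the sketch is only that the timeout path and the strict state check disagree about whether PENDING_OPEN is a legal starting point for reassignment; how the committed patch reconciles the two is not shown in this message.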

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

