hbase-issues mailing list archives

From "HBase Review Board (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-3147) Regions stuck in transition after rolling restart, perpetual timeout handling but nothing happens
Date Mon, 25 Oct 2010 23:40:20 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924784#action_12924784 ]

HBase Review Board commented on HBASE-3147:
-------------------------------------------

Message from: "Jonathan Gray" <jgray@apache.org>

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1087/#review1662
-----------------------------------------------------------

Ship it!


Looks good.  Not sure if I can +1 my patch but I think we should commit :)


trunk/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
<http://review.cloudera.org/r/1087/#comment5542>

    Should we remove this code from inside of ServerShutdownHandler now?  Not a big deal, but it's being done twice.


- Jonathan





> Regions stuck in transition after rolling restart, perpetual timeout handling but nothing happens
> -------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-3147
>                 URL: https://issues.apache.org/jira/browse/HBASE-3147
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>             Fix For: 0.90.0
>
>
> The rolling restart script is great for bringing on the weird stuff.  On my little loaded cluster, if I run it, it horks the cluster and the cluster doesn't recover.  I notice two issues that need fixing:
> 1. We'll miss noticing that a server was carrying .META. and it never gets assigned -- the shutdown handlers get stuck in a perpetual wait on a .META. assign that will never happen.
> 2. Perpetual cycling of this sequence per region not successfully assigned:
> {code}
> 2010-10-23 21:37:57,404 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  usertable,user510588360,1287547556587.7f2d92497d2d03917afd574ea2aca55b. state=PENDING_OPEN, ts=1287869814294
> 2010-10-23 21:37:57,404 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_OPEN or OPENING for too long, reassigning region=usertable,user510588360,1287547556587.7f2d92497d2d03917afd574ea2aca55b.
> 2010-10-23 21:37:57,404 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x2bd57d1475046a Attempting to transition node 7f2d92497d2d03917afd574ea2aca55b from RS_ZK_REGION_OPENING to M_ZK_REGION_OFFLINE
> 2010-10-23 21:37:57,404 WARN org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x2bd57d1475046a Attempt to transition the unassigned node for 7f2d92497d2d03917afd574ea2aca55b from RS_ZK_REGION_OPENING to M_ZK_REGION_OFFLINE failed, the node existed but was in the state M_ZK_REGION_OFFLINE
> 2010-10-23 21:37:57,404 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region transitioned OPENING to OFFLINE so skipping timeout, region=usertable,user510588360,1287547556587.7f2d92497d2d03917afd574ea2aca55b.
> ...
> {code}
> The timeout period elapses again and then the same sequence repeats.
> This is what I've been working on.
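
The cycle described in item 2 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the HBase master code: `TimeoutLoopSketch`, `transition`, and `countStuckPeriods` are hypothetical names, and the unassigned node is modeled as an in-memory compare-and-set cell instead of a real ZooKeeper znode. Only the two state names are taken from the log above.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for the timeout handling seen in the log; not HBase code.
public class TimeoutLoopSketch {

    // State names copied from the log output quoted in the issue.
    enum NodeState { M_ZK_REGION_OFFLINE, RS_ZK_REGION_OPENING }

    // Assumption: the node transition behaves like a compare-and-set,
    // succeeding only if the node is currently in the expected state.
    static boolean transition(AtomicReference<NodeState> node,
                              NodeState expected, NodeState target) {
        return node.compareAndSet(expected, target);
    }

    // Runs the timeout check `periods` times and returns how many
    // of those periods ended without any change to the node.
    static int countStuckPeriods(int periods) {
        // The node is already OFFLINE, as the WARN line reports.
        AtomicReference<NodeState> node =
                new AtomicReference<>(NodeState.M_ZK_REGION_OFFLINE);
        int stuck = 0;
        for (int i = 0; i < periods; i++) {
            // The timeout handler expects OPENING and asks for OFFLINE.
            boolean moved = transition(node,
                    NodeState.RS_ZK_REGION_OPENING,
                    NodeState.M_ZK_REGION_OFFLINE);
            if (!moved) {
                // Mirrors "so skipping timeout": nothing is reassigned,
                // so the next period starts from the same state.
                stuck++;
            }
        }
        return stuck;
    }

    public static void main(String[] args) {
        System.out.println("stuck periods: " + countStuckPeriods(3));
    }
}
```

In this model every period is counted as stuck, because the expected state never matches and the failure path does nothing to trigger a fresh assignment.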

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

