hbase-issues mailing list archives

From "Jimmy Xiang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-10085) Some regions aren't re-assigned after a master restarts
Date Fri, 06 Dec 2013 01:11:35 GMT

    [ https://issues.apache.org/jira/browse/HBASE-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13840772#comment-13840772 ]

Jimmy Xiang commented on HBASE-10085:

[~jeffreyz], I thought about it again, and I don't think we should fix it the way this patch does. I think it is
better to fix AM#processRegionsInTransition(RegionTransition, HRegionInfo, int) so that if
the server is offline, instead of forcing the region offline, we just restore its state
based on the EventType. If the EventType is offline, set the region to pending_open. What
do you think?
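
A minimal sketch of the idea above, not the actual patch. The class, the enums, and the method name are hypothetical, simplified stand-ins rather than the real HBase types; only the offline-to-pending_open mapping comes from the comment, and the fallback branch is an assumption that keeps the existing force-offline behavior for every other event type.

```java
// Hypothetical, simplified stand-ins; these are NOT the real HBase classes.
final class RestoreStateSketch {

    // Only M_ZK_REGION_OFFLINE is taken from the log in this issue;
    // OTHER is a placeholder for every event type the comment does not discuss.
    enum EventType { M_ZK_REGION_OFFLINE, OTHER }

    // Simplified in-memory region states used by this sketch.
    enum RegionState { OFFLINE, PENDING_OPEN }

    /**
     * Decides the in-memory state for a region in transition whose
     * last known server is offline when the master starts up.
     */
    static RegionState restoreStateForDeadServer(EventType eventType) {
        if (eventType == EventType.M_ZK_REGION_OFFLINE) {
            // Proposed change: restore the state implied by the event
            // instead of forcing the region offline again.
            return RegionState.PENDING_OPEN;
        }
        // Assumption: all other event types keep the current behavior.
        return RegionState.OFFLINE;
    }

    public static void main(String[] args) {
        System.out.println(restoreStateForDeadServer(EventType.M_ZK_REGION_OFFLINE));
        System.out.println(restoreStateForDeadServer(EventType.OTHER));
    }
}
```

The point of the sketch is that the decision depends on the EventType alone, so a region that was already offline before the restart is not forced offline a second time.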

> Some regions aren't re-assigned after a master restarts
> -------------------------------------------------------
>                 Key: HBASE-10085
>                 URL: https://issues.apache.org/jira/browse/HBASE-10085
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment
>    Affects Versions: 0.96.1
>            Reporter: Jeffrey Zhong
>            Assignee: Jeffrey Zhong
>             Fix For: 0.98.0, 0.96.1
>         Attachments: hbase-10085.patch
> We saw this issue during a cluster restart:
> 1) When a cluster is shut down, some regions are left in the offline state because no region servers
are available (the RSs are stopped first, then the Master).
> 2) When the cluster restarts, the offlined regions are forced offline again, and
SSH skips re-assigning them in AM.processServerShutdown, as shown below.
> {code}
> 2013-12-03 10:41:56,686 INFO  [master:h2-ubuntu12-sec-1386048659-hbase-8:60000] master.AssignmentManager:
Processing 873dbd8c269f44d0aefb0f66c5b53537 in state: M_ZK_REGION_OFFLINE
> 2013-12-03 10:41:56,686 DEBUG [master:h2-ubuntu12-sec-1386048659-hbase-8:60000] master.AssignmentManager:
RIT 873dbd8c269f44d0aefb0f66c5b53537 in state=M_ZK_REGION_OFFLINE was on deadserver; forcing
> ...
> 2013-12-03 10:41:56,739 DEBUG [AM.-pool1-t8] master.AssignmentManager: Force region state
offline {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, ts=1386067316737, server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696}
> ...
> 2013-12-03 10:41:57,223 WARN  [MASTER_SERVER_OPERATIONS-h2-ubuntu12-sec-1386048659-hbase-8:60000-3]
master.RegionStates: THIS SHOULD NOT HAPPEN: unexpected {873dbd8c269f44d0aefb0f66c5b53537
state=OFFLINE, ts=1386067316737, server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696}

> {code}

This message was sent by Atlassian JIRA
