hbase-issues mailing list archives

From "ramkrishna.s.vasudevan (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5094) The META can hold an entry for a region with a different server name from the one actually in the AssignmentManager thus making the region inaccessible.
Date Mon, 26 Dec 2011 11:10:31 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175925#comment-13175925
] 

ramkrishna.s.vasudevan commented on HBASE-5094:
-----------------------------------------------

RegionX is reassigned to RS_C during the RS_A shutdown, although RegionX was just assigned to
RS_B by the load balancer. So the .META. table indicates RegionX is on RS_C, while both RS_B and RS_C
think they have RegionX. Later, when RS_C shuts down, RegionX is reassigned to RS_B, and RS_B replies
ALREADY_OPENED. Thus the region is considered assigned to RS_B even though .META. indicates
it is on RS_C.

1) RegionX is moved from RS_A to RS_B.
2) RS_A goes down and ServerShutdownHandler runs.  ServerShutdownHandler finds RegionX listed with
RS_A in .META., because .META. has not yet been updated to RS_B.
3) Because RS_A went down, RegionX is assigned from RS_A to RS_C.
4) RS_C goes down. ServerShutdownHandler processes RegionX and tries to assign it to RS_B.
5) RS_B replies ALREADY_OPENED, but .META. still shows RS_C.
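
The five steps above can be replayed with a small toy model. This is only a sketch for following the sequence, not HBase code: the class, fields, and methods below are hypothetical, and it assumes for simplicity that RS_B's own .META. write from step 1 never becomes visible.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of the sequence above. The fields stand in for .META., the
// master's in-memory view, and the region servers; none of these names
// are HBase APIs.
class StaleMetaSketch {
    String metaLocation = "RS_A";    // server recorded in .META. for RegionX
    String masterLocation = "RS_A";  // server the master considers RegionX assigned to
    String lastReply = "";           // reply to the most recent open request
    final Set<String> hosts = new HashSet<>(); // servers that think they hold RegionX

    StaleMetaSketch() {
        hosts.add("RS_A");
    }

    // Ask a server to open RegionX. A server that already holds the region
    // answers ALREADY_OPENED and leaves .META. untouched.
    String open(String server, boolean metaWriteLands) {
        masterLocation = server;
        if (!hosts.add(server)) {
            lastReply = "ALREADY_OPENED";
        } else {
            if (metaWriteLands) {
                metaLocation = server;
            }
            lastReply = "OPENED";
        }
        return lastReply;
    }

    // A server dies. Returns true if .META. still lists RegionX on it,
    // which is what triggers reassignment in this model.
    boolean diesWithRegionInMeta(String server) {
        hosts.remove(server);
        return server.equals(metaLocation);
    }

    static StaleMetaSketch replay() {
        StaleMetaSketch s = new StaleMetaSketch();
        s.open("RS_B", false);                  // 1) balancer move; .META. not updated (assumption)
        if (s.diesWithRegionInMeta("RS_A")) {   // 2) RS_A dies; .META. still says RS_A
            s.open("RS_C", true);               // 3) reassigned to RS_C; .META. now says RS_C
        }
        if (s.diesWithRegionInMeta("RS_C")) {   // 4) RS_C dies; handler picks RS_B
            s.open("RS_B", true);               // 5) RS_B already holds the region
        }
        return s;
    }

    public static void main(String[] args) {
        StaleMetaSketch s = replay();
        // prints: ALREADY_OPENED master=RS_B meta=RS_C
        System.out.println(s.lastReply + " master=" + s.masterLocation + " meta=" + s.metaLocation);
    }
}
```

The final state has the master pointing at RS_B while .META. still points at RS_C, which is the mismatch described in the issue title.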
                
> The META can hold an entry for a region with a different server name from the one actually
> in the AssignmentManager thus making the region inaccessible.
> --------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-5094
>                 URL: https://issues.apache.org/jira/browse/HBASE-5094
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0
>            Reporter: ramkrishna.s.vasudevan
>
> {code}
> RegionState rit = this.services.getAssignmentManager().isRegionInTransition(e.getKey());
>             ServerName addressFromAM = this.services.getAssignmentManager()
>                 .getRegionServerOfRegion(e.getKey());
>             if (rit != null && !rit.isClosing() && !rit.isPendingClose()) {
>               // Skip regions that were in transition unless CLOSING or
>               // PENDING_CLOSE
>               LOG.info("Skip assigning region " + rit.toString());
>             } else if (addressFromAM != null
>                 && !addressFromAM.equals(this.serverName)) {
>               LOG.debug("Skip assigning region "
>                     + e.getKey().getRegionNameAsString()
>                     + " because it has been opened in "
>                     + addressFromAM.getServerName());
>               }
> {code}
> In ServerShutdownHandler we try to get the address from the AM.  This address is initially
> null because it has not yet been updated after the region was opened, i.e. the callback after node
> deletion has not yet run on the master side.
> But removal from RIT is already complete on the master side.  So this will trigger a new assignment.
> So there is a small window between the region actually being added to the online
> list and the check of the existing address in the AM in ServerShutdownHandler.
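
The window described above can be seen by reducing the quoted check to a pure function. This is a hedged sketch, not the handler's code: `ShutdownDecisionSketch`, `Action`, and `decide` are hypothetical names, and the fall-through to reassignment is inferred from the statement that a new assignment is triggered, since the quoted snippet shows only the two skip branches.

```java
// Hypothetical reduction of the quoted check to a pure function.
// Nothing here is an HBase API.
class ShutdownDecisionSketch {
    enum Action { SKIP_IN_TRANSITION, SKIP_OPENED_ELSEWHERE, REASSIGN }

    // inTransitionNotClosing: the region is in RIT and is neither CLOSING nor PENDING_CLOSE
    // addressFromAM: server the AM reports for the region, or null if none is recorded yet
    // deadServer: the server whose shutdown is being processed
    static Action decide(boolean inTransitionNotClosing, String addressFromAM, String deadServer) {
        if (inTransitionNotClosing) {
            return Action.SKIP_IN_TRANSITION;
        }
        if (addressFromAM != null && !addressFromAM.equals(deadServer)) {
            return Action.SKIP_OPENED_ELSEWHERE;
        }
        // Neither guard fired. This includes the window where the region has
        // left RIT but the AM has not recorded its new server yet (null address).
        return Action.REASSIGN;
    }

    public static void main(String[] args) {
        // Region already out of RIT, AM address still null: falls through.
        System.out.println(decide(false, null, "RS_A"));
        // Same region once the AM has recorded RS_B: skipped.
        System.out.println(decide(false, "RS_B", "RS_A"));
    }
}
```

In this reduction the null address and the dead server's own address take the same path, so the check cannot tell "opened elsewhere but not yet recorded" from "still on the dead server".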

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
