hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ramkrishna.s.vasudevan (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5094) The META can hold an entry for a region with a different server name from the one actually in the AssignmentManager thus making the region inaccessible.
Date Wed, 04 Jan 2012 18:47:40 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13179738#comment-13179738
] 

ramkrishna.s.vasudevan commented on HBASE-5094:
-----------------------------------------------

@Stack
I tried the option of applying the fix on the master side.  But i felt like the history of
the region was not available.  (May be am wrong.)
Also the ServerShutdownhandler.processShutDown() removes the online server first and also
the list of regions.

Then it is after the region opening is done by the balancer flow the new server name is again
added back.
{code}
if (isServerOnline(sn)) {
        this.regions.put(regionInfo, sn);
        addToServers(sn, regionInfo);
        this.regions.notifyAll();
{code}
bq.Could we make it such that only one thread can transition a region at a time?
This i did not think much on this.  

@Ming Ma
If it is ok, do you mind sharing the patch that you had prepared ?

                
> The META can hold an entry for a region with a different server name from the one actually
in the AssignmentManager thus making the region inaccessible.
> --------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-5094
>                 URL: https://issues.apache.org/jira/browse/HBASE-5094
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>         Attachments: HBASE-5094_1.patch
>
>
> {code}
> RegionState rit = this.services.getAssignmentManager().isRegionInTransition(e.getKey());
>             ServerName addressFromAM = this.services.getAssignmentManager()
>                 .getRegionServerOfRegion(e.getKey());
>             if (rit != null && !rit.isClosing() && !rit.isPendingClose())
{
>               // Skip regions that were in transition unless CLOSING or
>               // PENDING_CLOSE
>               LOG.info("Skip assigning region " + rit.toString());
>             } else if (addressFromAM != null
>                 && !addressFromAM.equals(this.serverName)) {
>               LOG.debug("Skip assigning region "
>                     + e.getKey().getRegionNameAsString()
>                     + " because it has been opened in "
>                     + addressFromAM.getServerName());
>               }
> {code}
> In ServerShutDownHandler we try to get the address in the AM.  This address is initially
null because it is not yet updated after the region was opened .i.e. the CAll back after node
deletion is not yet done in the master side.
> But removal from RIT is completed on the master side.  So this will trigger a new assignment.
> So there is a small window between the online region is actually added in to the online
list and the ServerShutdownHandler where we check the existing address in AM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message