hbase-issues mailing list archives

From "ramkrishna.s.vasudevan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4153) Handle RegionAlreadyInTransitionException in AssignmentManager
Date Fri, 09 Sep 2011 13:32:08 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101200#comment-13101200 ]

ramkrishna.s.vasudevan commented on HBASE-4153:

After HBASE-4015, my previous observations change as follows. Please note that
the fix as part of this JIRA is: once we get RegionAlreadyInTransitionException, I
will not move the in-memory state to OFFLINE.
-> Open Open
Here, if the first open region call is in progress:
a) Before the transition OFFLINE->OPENING or OPENING->OPENED
The second open region call will set the node data to OFFLINE, so there will be a version
mismatch when the first RS tries to transition to OPENING, and hence the first open region
call will fail. The second open region call will get RegionAlreadyInTransitionException, and
it is then up to the TimeoutMonitor to open the region once it finds the RIT in PENDING_OPEN.
b) After the transition to OPENED
Because the in-memory state is not moved to OFFLINE on RegionAlreadyInTransitionException,
once the callback for the OPENED node reaches the master we can delete the in-memory
PENDING_OPEN state left by the second open region call (this is already happening).
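The version mismatch in case (a) can be illustrated with a minimal sketch. This is not the ZKAssign API: the class VersionedNodeSketch and its methods are hypothetical stand-ins for a ZooKeeper node whose version increments on every write, and the sketch only assumes that the RS transition is conditional on the version it read earlier.

```java
// Hypothetical stand-in for an unassigned-region node in ZooKeeper.
// Every write bumps the version; conditional writes must name the current version.
final class VersionedNodeSketch {
    enum State { OFFLINE, OPENING, OPENED }

    private State state = State.OFFLINE;
    private int version = 0;

    synchronized int getVersion() { return version; }
    synchronized State getState() { return state; }

    // Unconditional write, modelling the second open call forcing the node to OFFLINE.
    synchronized int forceState(State newState) {
        state = newState;
        return ++version;
    }

    // Conditional write, modelling the RS transition.
    // Returns the new version, or -1 if the state or the expected version does not match.
    synchronized int transition(State from, State to, int expectedVersion) {
        if (version != expectedVersion || state != from) {
            return -1;
        }
        state = to;
        return ++version;
    }

    public static void main(String[] args) {
        VersionedNodeSketch node = new VersionedNodeSketch();
        int seenByFirstOpen = node.getVersion();   // first open reads the node
        node.forceState(State.OFFLINE);            // second open rewrites it
        int result = node.transition(State.OFFLINE, State.OPENING, seenByFirstOpen);
        System.out.println(result == -1
            ? "first open failed: version mismatch"
            : "first open proceeded");
    }
}
```

Even though the state is still OFFLINE after the second call, the first open fails because the version it holds is stale.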

If we leave the in-memory state in OFFLINE, as per the current behaviour, the OPENED
callback is rejected by this check:
{code}
          if (regionState == null ||
              (!regionState.isPendingOpen() && !regionState.isOpening())) {
            LOG.warn("Received OPENED for region " +
                prettyPrintedRegionName +
                " from server " + data.getOrigin() + " but region was in " +
                " the state " + regionState + " and not " +
                "in expected PENDING_OPEN or OPENING states");
{code}
This is the major problem I see.
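A rough sketch of why the in-memory state matters for the OPENED callback is given here. The class OpenedCallbackSketch, its map, and the resetToOffline flag are all hypothetical and only model the check quoted in this comment; they are not the AssignmentManager code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the master's in-memory regions-in-transition (RIT) map.
final class OpenedCallbackSketch {
    enum State { OFFLINE, PENDING_OPEN, OPENING }

    private final Map<String, State> regionsInTransition = new HashMap<>();

    // Models a second assign call that gets RegionAlreadyInTransitionException.
    // resetToOffline = true models the current behaviour,
    // resetToOffline = false models the fix proposed in this comment.
    void assignRejectedAsAlreadyInTransition(String region, boolean resetToOffline) {
        regionsInTransition.put(region, State.PENDING_OPEN);
        if (resetToOffline) {
            regionsInTransition.put(region, State.OFFLINE);
        }
    }

    // Mirrors the quoted check: OPENED is only accepted from PENDING_OPEN or OPENING.
    // Returns true if the RIT entry was cleared, false if the callback was ignored.
    boolean handleOpened(String region) {
        State s = regionsInTransition.get(region);
        if (s == null || (s != State.PENDING_OPEN && s != State.OPENING)) {
            return false; // the LOG.warn path
        }
        regionsInTransition.remove(region);
        return true;
    }

    boolean isInTransition(String region) {
        return regionsInTransition.containsKey(region);
    }
}
```

With resetToOffline set to true the OPENED callback is ignored and the region stays in transition; with false the entry is cleared as soon as the callback arrives.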

-> Close Open
As per my previous analysis:
a) Before the transition from CLOSING to CLOSED
When an open call arrives while a close region is in progress, the following code on the RS
comes into play:
{code}
    try {
      if (ZKAssign.transitionNodeClosed(server.getZooKeeper(), regionInfo,
          server.getServerName(), expectedVersion) == FAILED) {
        LOG.warn("Completed the CLOSE of a region but when transitioning from " +
            " CLOSING to CLOSED got a version mismatch, someone else clashed " +
            "so now unassigning");
{code}
The region will be closed on the RS side, but the RIT in the master will be in PENDING_OPEN
due to RegionAlreadyInTransitionException, and again the TimeoutMonitor will take care of
opening the region.
b) After setting the node to the CLOSED state
Here the assign call will happen once again as part of CloseRegionProcessing, and if a parallel
new open region call arrives, it goes back to the Open Open case described previously.

Please note that in all cases assign() and unassign() were invoked manually through the admin.
I am not sure whether you are planning to handle this scenario in a totally different way,
since from the above analysis we can infer that the second operation largely depends on the
TimeoutMonitor to succeed.
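To make the dependency on the timeout monitor concrete, here is a minimal sketch of the kind of scan described above. The names TimeoutMonitorSketch, Rit and regionsToRetry are hypothetical, and the timeout value is a parameter rather than any actual HBase setting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical model of a periodic scan over regions in transition (RIT).
final class TimeoutMonitorSketch {
    // One RIT entry: the in-memory state and the time it was last updated.
    static final class Rit {
        final String state;
        final long stampMillis;

        Rit(String state, long stampMillis) {
            this.state = state;
            this.stampMillis = stampMillis;
        }
    }

    // Returns the regions stuck in PENDING_OPEN for longer than timeoutMillis,
    // i.e. the ones the monitor would try to open again.
    static List<String> regionsToRetry(Map<String, Rit> rit, long nowMillis, long timeoutMillis) {
        List<String> stuck = new ArrayList<>();
        for (Map.Entry<String, Rit> e : rit.entrySet()) {
            Rit r = e.getValue();
            if ("PENDING_OPEN".equals(r.state) && nowMillis - r.stampMillis > timeoutMillis) {
                stuck.add(e.getKey());
            }
        }
        return stuck;
    }
}
```

In this model the second operation only completes once the timeout has elapsed, which is the delay the analysis above is concerned about.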

> Handle RegionAlreadyInTransitionException in AssignmentManager
> --------------------------------------------------------------
>                 Key: HBASE-4153
>                 URL: https://issues.apache.org/jira/browse/HBASE-4153
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.92.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.92.0
> Comment from Stack over in HBASE-3741:
> {quote}
> Question: Looking at this patch again, if we throw a RegionAlreadyInTransitionException, won't we just assign the region elsewhere though RegionAlreadyInTransitionException in at least one case here is saying that the region is already open on this regionserver?
> {quote}
> Indeed looking at the code it's going to be handled the same way other exceptions are. Need to add special cases for assign and unassign.


