hbase-issues mailing list archives

From "ramkrishna.s.vasudevan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4153) Handle RegionAlreadyInTransitionException in AssignmentManager
Date Thu, 08 Sep 2011 15:27:09 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100386#comment-13100386 ]

ramkrishna.s.vasudevan commented on HBASE-4153:
-----------------------------------------------

Please find below my analysis of the following state transitions.

This is how I tried to simulate the scenarios:
Create some 7 or 8 regions.
Using HBaseAdmin, call unassign(regionname, false) and assign(regionname, false) in parallel.
See what happens when both operations run in parallel.

Correct me if I am wrong.  Please provide your suggestions.
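
A minimal sketch of how the two calls could be fired in parallel for one region. RegionAdmin, AdminCall and ParallelAssignDemo are hypothetical names used only for illustration; RegionAdmin merely stands in for the two admin calls mentioned above and is not the HBaseAdmin API. A real test would delegate to the cluster's admin client instead of the no-op fake used in main.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the two admin calls; NOT the HBaseAdmin API. */
interface RegionAdmin {
  void assign(String regionName, boolean force) throws Exception;
  void unassign(String regionName, boolean force) throws Exception;
}

/** Hypothetical helper type: an admin call that may throw. */
interface AdminCall {
  void run() throws Exception;
}

class ParallelAssignDemo {

  /** Issues unassign and assign for the same region from two threads and records each outcome. */
  static List<String> race(RegionAdmin admin, String region) throws InterruptedException {
    CountDownLatch start = new CountDownLatch(1);
    List<String> outcomes = new CopyOnWriteArrayList<>();
    ExecutorService pool = Executors.newFixedThreadPool(2);
    pool.submit(task("unassign", () -> admin.unassign(region, false), start, outcomes));
    pool.submit(task("assign", () -> admin.assign(region, false), start, outcomes));
    start.countDown(); // release both calls as close together as possible
    pool.shutdown();
    pool.awaitTermination(30, TimeUnit.SECONDS);
    return outcomes;
  }

  /** Wraps one call so that it waits for the start signal and records success or failure. */
  private static Runnable task(String label, AdminCall call,
                               CountDownLatch start, List<String> outcomes) {
    return () -> {
      try {
        start.await();
        call.run();
        outcomes.add(label + ": ok");
      } catch (Exception e) {
        outcomes.add(label + ": failed (" + e + ")");
      }
    };
  }

  public static void main(String[] args) throws Exception {
    // No-op fake; replace with a wrapper around a real admin client to hit a cluster.
    RegionAdmin fake = new RegionAdmin() {
      public void assign(String regionName, boolean force) { }
      public void unassign(String regionName, boolean force) { }
    };
    for (int i = 0; i < 8; i++) {
      System.out.println("region-" + i + " " + race(fake, "region-" + i));
    }
  }
}
```

The latch only makes the two calls start close together; which one reaches the master first still varies from run to run, so the scenario needs to be repeated over several regions to see each interleaving.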

1) Close / Close -> No problem
2) Close / Open
Here we depend on the timeout.
 Assume the close is only partially complete when the open arrives.
 -> After the node is set to CLOSED
	Here the close completes successfully, but for the open we need to wait for
	the timeout monitor to detect that the region is in RIT, because the in-memory
	state is set to OFFLINE once RegionAlreadyInTransitionException happens.
 -> Before the node is set to CLOSED
	Here the problem is that the close does not complete properly and the open also
	fails, leaving the in-memory state as OFFLINE.
	The close itself fails because, when we try to assign the region, it forcefully
	moves the znode to OFFLINE, so the close is not able to move it from CLOSING to CLOSED.
Maybe, if we get a RegionAlreadyInTransitionException, we should just not update the in-memory state to OFFLINE.
Either the previous open will succeed or, even if it fails, the PENDING_OPEN
timeout transition will happen anyway.

3) Open / Open
This case causes a problem.
Assume one open of the region is in progress.
The next open of the same region simply fails and sets the in-memory state to OFFLINE.
Now the first open completes and moves the node to OPENED.
In the handling of the OPENED state
{code}
          if (regionState == null ||
              (!regionState.isPendingOpen() && !regionState.isOpening())) {
            LOG.warn("Received OPENED for region " +
                prettyPrintedRegionName +
                " from server " + data.getOrigin() + " but region was in " +
                " the state " + regionState + " and not " +
                "in expected PENDING_OPEN or OPENING states");
            return;
          }
{code}
we have the above code.  Hence the region can never be added to the master's online list.
This is the scenario handled in the HBASE-4015 patch, where a race happens between
forcing the node to OFFLINE and the node having already moved to OPENING.
{code}
+      // If we are reassigning the node do not force in-memory state to OFFLINE.
+      // Based on the znode state we will decide if to change
+      // in-memory state to OFFLINE or not. It will
+      // be done before setting the znode to OFFLINE state.
+      if (!hijackAndPreempt) {
+        LOG.debug("Forcing OFFLINE; was=" + state);
+        state.update(RegionState.State.OFFLINE);
+      }
{code}
4) Open / Close
This was not a separate case in my testing, because once we call unassign() on a region it
calls assign anyway once the close succeeds.  Hence it ends up as one of the three cases above.
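
To make the Open / Open case (3) and the suggestion from case (2) concrete, here is a toy model of the master's in-memory state. It is only a sketch under assumptions: OpenOpenRaceDemo, ToyMaster and the keepStateOnRit flag are hypothetical names, there is no ZooKeeper or region server involved, and this is not AssignmentManager code. It only mirrors the two behaviours described above: a rejected second open sets the in-memory state to OFFLINE, and the OPENED handler ignores the event unless the state is PENDING_OPEN or OPENING.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class OpenOpenRaceDemo {
  enum State { OFFLINE, PENDING_OPEN, OPENING, OPEN }

  /** Toy master; all names are hypothetical and exist only for this illustration. */
  static class ToyMaster {
    // true = proposed behaviour: leave in-memory state alone when the open is rejected
    final boolean keepStateOnRit;
    final Map<String, State> inMemory = new HashMap<>();
    final Set<String> online = new HashSet<>();
    // regions for which the (imaginary) region server is already running an open
    final Set<String> openInProgress = new HashSet<>();

    ToyMaster(boolean keepStateOnRit) {
      this.keepStateOnRit = keepStateOnRit;
    }

    /** Returns false when the region server rejects the open as already in transition. */
    boolean assign(String region) {
      if (openInProgress.contains(region)) {
        if (!keepStateOnRit) {
          // behaviour described in case 3: the failed open sets the state to OFFLINE
          inMemory.put(region, State.OFFLINE);
        }
        return false;
      }
      inMemory.put(region, State.PENDING_OPEN);
      openInProgress.add(region);
      return true;
    }

    /** Mirrors the guard in the OPENED handling quoted above. */
    boolean handleOpened(String region) {
      openInProgress.remove(region);
      State s = inMemory.get(region);
      if (s == null || (s != State.PENDING_OPEN && s != State.OPENING)) {
        return false; // event ignored, region never reaches the online list
      }
      inMemory.put(region, State.OPEN);
      online.add(region);
      return true;
    }
  }

  public static void main(String[] args) {
    for (boolean keep : new boolean[] { false, true }) {
      ToyMaster master = new ToyMaster(keep);
      master.assign("r1");                 // first open, in progress
      master.assign("r1");                 // second open, rejected
      boolean accepted = master.handleOpened("r1");
      System.out.println("keepStateOnRit=" + keep + " openedAccepted=" + accepted
          + " online=" + master.online);
    }
  }
}
```

In this model the OPENED event is dropped when the rejected open overwrites the state, and accepted when the state is left untouched. Whether leaving the state alone is safe in the real master still depends on the PENDING_OPEN timeout handling mentioned in case (2).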


> Handle RegionAlreadyInTransitionException in AssignmentManager
> --------------------------------------------------------------
>
>                 Key: HBASE-4153
>                 URL: https://issues.apache.org/jira/browse/HBASE-4153
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.92.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.92.0
>
>
> Comment from Stack over in HBASE-3741:
> {quote}
> Question: Looking at this patch again, if we throw a RegionAlreadyInTransitionException, won't we just assign the region elsewhere though RegionAlreadyInTransitionException in at least one case here is saying that the region is already open on this regionserver?
> {quote}
> Indeed looking at the code it's going to be handled the same way other exceptions are. Need to add special cases for assign and unassign.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
