hbase-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-6070) AM.nodeDeleted and SSH races creating problems for regions under SPLIT
Date Sat, 26 May 2012 00:07:27 GMT

    [ https://issues.apache.org/jira/browse/HBASE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283830#comment-13283830 ]

Hudson commented on HBASE-6070:
-------------------------------

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #16 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/16/])
    HBASE-6070 AM.nodeDeleted and SSH races creating problems for regions under SPLIT (Ramkrishna)
(Revision 1342724)

     Result = FAILURE
ramkrishna : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/Mocking.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java

                
> AM.nodeDeleted and SSH races creating problems for regions under SPLIT
> ----------------------------------------------------------------------
>
>                 Key: HBASE-6070
>                 URL: https://issues.apache.org/jira/browse/HBASE-6070
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1, 0.94.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.92.2, 0.96.0, 0.94.1
>
>         Attachments: HBASE-6070_0.92.patch, HBASE-6070_0.92_1.patch, HBASE-6070_0.94.patch, HBASE-6070_0.94_1.patch, HBASE-6070_trunk.patch, HBASE-6070_trunk_1.patch
>
>
> As part of HBASE-5806 we tried to address the problems that occur when the Master or a region server (RS) restarts while a region SPLIT is in progress.
> While testing further, we found that one race condition still remains:
> -> (1) A split has just started and the znode is in RS_SPLIT state.
> -> (2) The RS goes down.
> -> (3) The first callback for the ServerShutdownHandler (SSH) arrives.
> -> (4) Because of the fix for HBASE-5806, SSH knows that some region is in transition (RIT).
> -> (5) Now a nodeDeleted event arrives for the SPLIT znode, and in handling it we remove the region from RIT.
> -> (6) After this, SSH checks whether any region is in RIT. Because we don't find the region in RIT, the region is never assigned.
> When we fixed HBASE-5806, step 6 happened before step 5, so we missed this case. Now that we have found it, we will come up with a patch shortly.
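
The ordering problem described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: the class SplitRaceSketch and all of its methods are hypothetical stand-ins, not the real AssignmentManager or ServerShutdownHandler code, and the set-based RIT bookkeeping is a simplification. It shows only why the outcome depends on whether the nodeDeleted callback runs before or after the shutdown handler's RIT check. It does not show the fix committed for this issue.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical toy model of the race; not the real HBase classes. */
public class SplitRaceSketch {
    // Stand-in for the master's regions-in-transition (RIT) bookkeeping.
    private final Set<String> regionsInTransition = new HashSet<>();

    // The split starts, so the region enters RIT.
    void splitStarted(String region) {
        regionsInTransition.add(region);
    }

    // Stand-in for the nodeDeleted callback: drops the region from RIT.
    void nodeDeleted(String region) {
        regionsInTransition.remove(region);
    }

    // Stand-in for the shutdown handler's later check: only regions of the
    // dead server that are still found in RIT get assigned.
    List<String> assignRegionsStillInRit(List<String> deadServerRegions) {
        List<String> assigned = new ArrayList<>();
        for (String region : deadServerRegions) {
            if (regionsInTransition.contains(region)) {
                assigned.add(region);
            }
        }
        return assigned;
    }

    // Replays the scenario with one of the two possible event orderings
    // and returns the regions that ended up being assigned.
    static List<String> run(boolean nodeDeletedBeforeCheck) {
        SplitRaceSketch master = new SplitRaceSketch();
        String region = "region-A";
        master.splitStarted(region);
        if (nodeDeletedBeforeCheck) {
            master.nodeDeleted(region); // callback wins the race
        }
        List<String> assigned =
            master.assignRegionsStillInRit(Collections.singletonList(region));
        if (!nodeDeletedBeforeCheck) {
            master.nodeDeleted(region); // callback arrives after the check
        }
        return assigned;
    }

    public static void main(String[] args) {
        System.out.println("check before nodeDeleted: " + run(false));
        System.out.println("nodeDeleted before check: " + run(true));
    }
}
```

In this model, run(false) assigns the region, while run(true) returns an empty list, which corresponds to the region never being assigned.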

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
