hbase-issues mailing list archives

From "ramkrishna.s.vasudevan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7103) Need to fail split if SPLIT znode is deleted even before the split is completed.
Date Sat, 10 Nov 2012 12:51:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13494652#comment-13494652 ]

ramkrishna.s.vasudevan commented on HBASE-7103:

OK Lars.  I understand.  No problem.
Just before we commit this, I have a suggestion:
    String node = ZKAssign.getNodeName(zkw, region.getEncodedName());
    if (!ZKUtil.createEphemeralNodeAndWatch(zkw, node, data.getBytes())) {
      throw new IOException("Failed create of ephemeral " + node);
    }
    // Transition node from SPLITTING to SPLITTING and pick up version so we
    // can be sure this znode is ours; version is needed deleting.
    return transitionNodeSplitting(zkw, region, serverName, -1);
Here, after creating the node, we transition it once from SPLITTING to SPLITTING just to get
the znode version.  Can we get the znode version right after creating the node instead?
If creation itself fails, there is no node at all.  If it succeeds, the next step adds the
journal entry SET_SPLITTING_IN_ZK anyway.
With the transition the version ends up as 1, but if we don't do the transition it will
be 0.
The advantage is that the next time a parallel split comes along, the node will already exist
when it tries to create the znode, and its rollback will not do anything with the znode.
What do you think?  My intention was to solve both HBASE-7103 and HBASE-6088.
Lars, I leave it to you.  If you prefer, we can revert this and address it in the next version,
0.94.4.  If not, we can try for a patch in this version.  If you are OK with that, I can submit
a patch for it.
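
To make the version numbers in the suggestion concrete, here is a minimal sketch in Java. It assumes only the standard ZooKeeper versioning rule: a freshly created znode has data version 0 and each successful data update increments it by one. `ToyZk` and the two helper methods are hypothetical in-memory stand-ins written for illustration; they are not the HBase `ZKAssign`/`ZKUtil` API and not a proposed patch.

```java
import java.util.HashMap;
import java.util.Map;

class ZnodeVersionSketch {

    // Hypothetical in-memory stand-in for ZooKeeper (NOT the real client API).
    static class ToyZk {
        // path -> data version of the znode stored at that path
        private final Map<String, Integer> versions = new HashMap<>();

        // Creates the node at version 0; returns false if it already exists.
        boolean create(String path) {
            if (versions.containsKey(path)) {
                return false;
            }
            versions.put(path, 0);
            return true;
        }

        // Conditional update: expectedVersion -1 matches any version.
        // Returns the new version, or -1 if the node is missing or the
        // expected version does not match.
        int setData(String path, int expectedVersion) {
            Integer current = versions.get(path);
            if (current == null) {
                return -1;
            }
            if (expectedVersion != -1 && expectedVersion != current) {
                return -1;
            }
            versions.put(path, current + 1);
            return current + 1;
        }

        // Returns the current version, or -1 if the node does not exist.
        int getVersion(String path) {
            Integer current = versions.get(path);
            return current == null ? -1 : current;
        }
    }

    // Current behaviour as described above: create, then one extra
    // SPLITTING -> SPLITTING update whose only purpose is to learn the version.
    static int versionWithTransition(ToyZk zk, String path) {
        if (!zk.create(path)) {
            return -1; // node already exists: another split got there first
        }
        return zk.setData(path, -1);
    }

    // Suggested behaviour: create, then read the version with no extra update.
    static int versionWithoutTransition(ToyZk zk, String path) {
        if (!zk.create(path)) {
            return -1; // node already exists: another split got there first
        }
        return zk.getVersion(path);
    }

    public static void main(String[] args) {
        System.out.println("with transition:    " + versionWithTransition(new ToyZk(), "/P1"));
        System.out.println("without transition: " + versionWithoutTransition(new ToyZk(), "/P1"));
    }
}
```

In this toy model both variants hand the caller a version it can later use to prove the znode is its own; the second simply skips the extra write.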
> Need to fail split if SPLIT znode is deleted even before the split is completed.
> --------------------------------------------------------------------------------
>                 Key: HBASE-7103
>                 URL: https://issues.apache.org/jira/browse/HBASE-7103
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.94.3, 0.96.0
>         Attachments: 7103-6088-revert.txt, HBASE-7103_testcase.patch
> This came up after the following mail on the dev list:
> 'infinite loop of RS_ZK_REGION_SPLIT on .94.2'.
> The problem arises from the following sequence of steps:
> -> Initially the parent region P1 starts splitting.
> -> The split is going on normally.
> -> Another split starts at the same time for the same region P1. (Not sure why this
> -> Rollback happens seeing an already existing node.
> -> This node gets deleted in rollback and nodeDeleted Event starts.
> -> In nodeDeleted event the RIT for the region P1 gets deleted.
> -> Because of this there is no region in RIT.
> -> Now the first split gets over.  Here the problem is we try to transition the node
> from SPLITTING to SPLIT. But the node does not even exist.
> But we don't take any action on this.  We think it is successful.
> -> Because of this SplitRegionHandler never gets invoked.
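
The sequence of steps above can be replayed with a small Java sketch. It is a toy model only: the map stands in for ZooKeeper, and `createSplitting`, `rollbackDeletingNode` and `transitionToSplit` are hypothetical names invented here, not HBase methods. It shows why the result of the final SPLITTING to SPLIT transition has to be checked, which is what the issue title asks for.

```java
import java.util.HashMap;
import java.util.Map;

class SplitRaceSketch {

    // Hypothetical stand-in for ZooKeeper: region name -> znode state.
    private final Map<String, String> znodes = new HashMap<>();

    // Returns true if the SPLITTING node was created, false if one already exists.
    boolean createSplitting(String region) {
        return znodes.putIfAbsent(region, "SPLITTING") == null;
    }

    // Models the problematic rollback: it deletes the node even though
    // the node belongs to the other, still running split.
    void rollbackDeletingNode(String region) {
        znodes.remove(region);
    }

    // Returns true only if the node existed in SPLITTING and is now SPLIT.
    boolean transitionToSplit(String region) {
        return znodes.replace(region, "SPLITTING", "SPLIT");
    }

    // Replays the steps from the description and reports whether the
    // first split's final transition actually succeeded.
    static boolean firstSplitCompletes() {
        SplitRaceSketch zk = new SplitRaceSketch();
        String p1 = "P1";

        zk.createSplitting(p1);                       // first split starts on P1
        boolean secondCreated = zk.createSplitting(p1); // parallel split finds the node
        if (!secondCreated) {
            zk.rollbackDeletingNode(p1);              // its rollback deletes the node
        }
        return zk.transitionToSplit(p1);              // first split finishes: node is gone
    }

    public static void main(String[] args) {
        if (!firstSplitCompletes()) {
            // The point of the issue: treat this as a failed split
            // instead of silently assuming success.
            System.out.println("SPLITTING znode missing, split must fail");
        }
    }
}
```

If the caller ignores the `false` result, the split looks successful even though no SPLIT znode was ever written, which matches the symptom that SplitRegionHandler is never invoked.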

