hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7103) Need to fail split if SPLIT znode is deleted even before the split is completed.
Date Sun, 11 Nov 2012 04:49:12 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13494825#comment-13494825 ]

stack commented on HBASE-7103:

[~ram_krish] I don't think it is possible to get the version on create (let me ask one of the zk
lads).  That is why we do the SPLITTING to SPLITTING transition to get the version.  It's true
though that there is a hole in here: if we fail on create, there should be no rollback,
but if we fail moving SPLITTING to SPLITTING, then we should remove the created znode, but
ONLY if we have its version (it could have been created by someone else).  Maybe when we create,
we write some unique data into the znode and read it back after creating it to see what the
version is -- and if the unique data is not the same, we know that someone else owns the znode and
we should not roll back .... but that won't work either given it won't be backward compatible.
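
To make the unique-data idea concrete, here is a minimal sketch. It uses an in-memory stand-in for the znode store, not the ZooKeeper client API, and every name in it (OwnerTokenSketch, versionIfOwned, the token bytes) is hypothetical. It only shows the ownership check: read the node back after create and trust its version only if the data is the token we wrote.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in for a znode store; hypothetical, not the ZooKeeper client API. */
class OwnerTokenSketch {
  /** Data plus version, as a read would return them together. */
  static final class Node {
    final byte[] data;
    final int version;

    Node(byte[] data, int version) {
      this.data = data;
      this.version = version;
    }
  }

  private final Map<String, Node> nodes = new HashMap<>();

  /** Creates the node at version 0 if absent; returns false if it already exists. */
  boolean create(String path, byte[] data) {
    if (nodes.containsKey(path)) {
      return false;
    }
    nodes.put(path, new Node(data.clone(), 0));
    return true;
  }

  /** Returns the node, or null if there is none. */
  Node read(String path) {
    return nodes.get(path);
  }

  /** Returns the version if the node carries our token; -1 if it is missing or someone else's. */
  int versionIfOwned(String path, byte[] myToken) {
    Node n = read(path);
    if (n == null || !Arrays.equals(n.data, myToken)) {
      return -1;
    }
    return n.version;
  }
}
```

A caller that gets -1 back from versionIfOwned would skip the rollback delete, since the znode belongs to someone else. The backward-compatibility objection above still applies: this only works if every writer puts a token in the znode.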

If we fail the create of the znode, we should not rollback. It looks like we are doing that
now since adding the STARTED_SPLITTING state -- right?  That seems wrong... should we be inserting
the STARTED_SPLITTING state after the create of the znode?  But even then, I'm not sure about
deleting a znode unless we are sure we own it -- that the version matches.
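
A rough sketch of "delete only if the version matches", again against an in-memory stand-in with hypothetical names (VersionedDeleteSketch, deleteIfVersionMatches). ZooKeeper's own delete(path, version) treats -1 as "match any version", so the sketch refuses negative versions outright rather than letting an unknown version turn into an unconditional delete.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in mapping znode path to version; hypothetical, for illustration only. */
class VersionedDeleteSketch {
  private final Map<String, Integer> versions = new HashMap<>();

  /** Creates the node at version 0 if it does not exist yet. */
  void create(String path) {
    versions.putIfAbsent(path, 0);
  }

  /** Simulates an update by any party, which bumps the version. */
  void touch(String path) {
    versions.computeIfPresent(path, (p, v) -> v + 1);
  }

  /** Deletes only when the node exists and its version equals a known, non-negative one. */
  boolean deleteIfVersionMatches(String path, int expectedVersion) {
    Integer current = versions.get(path);
    if (current == null || expectedVersion < 0 || current != expectedVersion) {
      return false;
    }
    versions.remove(path);
    return true;
  }

  boolean exists(String path) {
    return versions.containsKey(path);
  }
}
```

With this shape, a rollback that never learned the version (or learned a stale one) leaves the znode alone instead of deleting a node that another split created.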

Should the following code be checking we did NOT get a -1?

        this.znodeVersion = createNodeSplitting(server.getZooKeeper(),
          this.parent.getRegionInfo(), server.getServerName());

It seems like createNodeSplitting could be returning -1 if it fails.
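
Something along these lines is what I have in mind for the check. It is only a sketch: SplitVersionGuard and requireValidVersion are made-up names, and it assumes that -1 is what createNodeSplitting returns on failure, which is the open question above.

```java
import java.io.IOException;

/** Hypothetical helper; assumes -1 is the failure value returned by createNodeSplitting. */
class SplitVersionGuard {
  /** Returns the version unchanged, or throws if the create did not yield a usable version. */
  static int requireValidVersion(int znodeVersion, String regionName) throws IOException {
    if (znodeVersion == -1) {
      throw new IOException("Failed to create SPLITTING znode for " + regionName);
    }
    return znodeVersion;
  }
}
```

Wrapping the assignment to this.znodeVersion in such a guard would make the split fail before any later step tries to use a version we never had.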

(Weird that transitionNode has explicit mention of M_ZK_REGION_OFFLINE and RS_ZK_REGION_OPENING
though it takes beginState and endState, but that is another issue.)

> Need to fail split if SPLIT znode is deleted even before the split is completed.
> --------------------------------------------------------------------------------
>                 Key: HBASE-7103
>                 URL: https://issues.apache.org/jira/browse/HBASE-7103
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.94.3, 0.96.0
>         Attachments: 7103-6088-revert.txt, HBASE-7103_testcase.patch
> This came up after the following mail on the dev list:
> 'infinite loop of RS_ZK_REGION_SPLIT on .94.2'.
> The problem arises from the following sequence of steps:
> -> Initially the parent region P1 starts splitting.
> -> The split is going on normally.
> -> Another split starts at the same time for the same region P1. (Not sure why this
> -> Rollback happens seeing an already existing node.
> -> This node gets deleted in rollback and nodeDeleted Event starts.
> -> In nodeDeleted event the RIT for the region P1 gets deleted.
> -> Because of this there is no region in RIT.
> -> Now the first split finishes.  Here the problem is that we try to transition the node
> from SPLITTING to SPLIT, but the node does not even exist.
> But we don't take any action on this.  We think it is successful.
> -> Because of this SplitRegionHandler never gets invoked.

