hbase-issues mailing list archives

From "Ming Ma (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4497) If region opening fails after updating META HBCK reports it as inconsistent and scanning the region throws NSRE
Date Wed, 28 Sep 2011 06:36:45 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13116232#comment-13116232 ]

Ming Ma commented on HBASE-4497:
--------------------------------

OK, Ram.

Some more clarification:

1. It looks like ZKAssign.transitionNode already provides atomicity via the "expected version"
feature in ZK, so we are good here.
2. A global AtomicInteger isn't necessary in this context. We can just use the "expected version"
from ZK for a given ZNode, since the "expected version" only needs to be unique per ZNode,
not globally.
3. With regard to the HBase .META. update, we can store the "expected version" as an ID in the
.META. table and, via some new HBase API such as checkGreaterAndPut, enforce that a new update's
ID is greater than the previous one for a given region. This ID value is local to the region
node, which should be OK: for a given region node, the value only ever increases. Currently
this "expected version" is passed via the RPC RegionOpeningState openRegion(HRegionInfo region,
int versionOfOfflineNode). Will that address the issue, Jonathan?
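
To make point 3 concrete, here is a minimal in-memory sketch of the semantics I have in mind for
checkGreaterAndPut. It is not an existing HBase API: the class MetaSketch, the Location holder,
and the method signature are all hypothetical, and a ConcurrentHashMap stands in for the .META.
table. It also assumes the version a region server receives for a given region node only ever
increases, as described above.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for .META.; not part of HBase.
class MetaSketch {
    // Server currently recorded for a region, plus the version it was written with.
    static final class Location {
        final String server;
        final int version;

        Location(String server, int version) {
            this.server = server;
            this.version = version;
        }
    }

    private final ConcurrentHashMap<String, Location> meta = new ConcurrentHashMap<>();

    // Writes (server, version) for the region only if version is strictly greater
    // than the stored one, or if no entry exists yet. Returns true if the write
    // was applied. compute() runs the check and the write atomically per key.
    boolean checkGreaterAndPut(String region, String server, int version) {
        boolean[] applied = {false};
        meta.compute(region, (key, current) -> {
            if (current == null || version > current.version) {
                applied[0] = true;
                return new Location(server, version);
            }
            return current; // stale update: keep the existing entry
        });
        return applied[0];
    }

    // Returns the server recorded for the region, or null if there is none.
    String serverFor(String region) {
        Location loc = meta.get(region);
        return loc == null ? null : loc.server;
    }
}
```

In the scenario from this issue, RS1 would hold the older version and RS2 the newer one. If RS2
writes first with version 2, a late checkGreaterAndPut from RS1 with version 1 returns false and
the entry keeps pointing at RS2, so .META. no longer ends up with RS1 as the latest.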



Jonathan, Dhruba's suggestion is interesting. Could scale become an issue when HBase grows to
the next level in number of machines, number of regions, and number of region movements?
The .META. table will be distributed across different RSs, so putting it on the Master could be
a bottleneck. However, we might run into other, more important issues first at such a large scale.
                
> If region opening fails after updating META HBCK reports it as inconsistent and scanning the region throws NSRE
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4497
>                 URL: https://issues.apache.org/jira/browse/HBASE-4497
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Priority: Critical
>
> As per the discussion in the mail chain "HBCK reporting of possible mismatch in RS assignment",
> this JIRA is created.
> Consider two RSs, RS1 and RS2.
> A region tries to open on RS1, but it takes a while. RS1 has not yet updated META
> or transitioned the node from OPENING to OPENED.
> So the timeout assigns the region to RS2. RS2 successfully updates META and opens the
> region.
> Now RS1 tries to act on the region by first updating META and then transitioning the
> node from OPENING to OPENED.
> RS1's transition of the node from OPENING to OPENED will fail, but the META entry will have
> RS1 as the latest.
> Now HBCK reports this as an inconsistency, and if we try to scan the region we get
> NotServingRegionException.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
