hbase-issues mailing list archives

From "Ming Ma (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4497) If region opening fails after updating META HBCK reports it as inconsistent and scanning the region throws NSRE
Date Fri, 30 Sep 2011 04:52:45 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13117881#comment-13117881 ]

Ming Ma commented on HBASE-4497:
--------------------------------

1. I agree the checkAndPut solution is good enough. I am just trying to find holes here. :)
2. Does the RS need access to the global counter? If it is only for the region assignment
scenario, I agree there is no such need. I initially thought of it as a "region operation id",
where the RS would also get a new ID whenever the state changes, for example from OPENING to
OPENED. We would use such a counter to track every region state change in the system.
3. Persistent vs. ephemeral. I thought there would be a way to provide a reliable ZK-based
AtomicLong that can survive HBase and ZK restarts. That would give us a good picture of the
event sequence in the system. Performance isn't that important, given that region state changes
happen relatively infrequently.
4. Unique vs. monotonically increasing. For this issue, a unique number seems to be fine. I
thought it might be used in other contexts to track event sequence, so monotonically increasing
is better, given that comparing two values indicates their order in time. It doesn't have to
be sequential.
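
To make points 1 and 4 concrete, here is a minimal sketch of how a check-and-put style guard
combined with an increasing assignment ID could reject a stale writer. It is only an
illustration under stated assumptions: an in-memory map stands in for META, an AtomicLong stands
in for the counter (so it is not persistent), and MetaSketch, assign, checkAndPutLocation and
location are hypothetical names, not HBase APIs.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class MetaSketch {
    // Hypothetical META row: the current assignment ID and the server recorded for it.
    private static final class Row {
        final long assignmentId;
        final String server;

        Row(long assignmentId, String server) {
            this.assignmentId = assignmentId;
            this.server = server;
        }
    }

    // In-memory stand-in for the global counter (assumption: a real one would be persistent).
    private final AtomicLong counter = new AtomicLong();
    // In-memory stand-in for the META table, keyed by region name.
    private final ConcurrentHashMap<String, Row> meta = new ConcurrentHashMap<>();

    // Master side: hand out a fresh, strictly larger ID for every (re)assignment.
    public long assign(String region) {
        long id = counter.incrementAndGet();
        meta.put(region, new Row(id, null));
        return id;
    }

    // Region server side: record the location only if the stored ID is still the expected one.
    public boolean checkAndPutLocation(String region, long expectedId, String server) {
        Row current = meta.get(region);
        if (current == null || current.assignmentId != expectedId) {
            return false; // a newer assignment exists, so this writer is stale
        }
        // Atomic swap; fails if another writer replaced the row in between.
        return meta.replace(region, current, new Row(expectedId, server));
    }

    public String location(String region) {
        Row row = meta.get(region);
        return row == null ? null : row.server;
    }

    public static void main(String[] args) {
        MetaSketch sketch = new MetaSketch();
        long rs1Id = sketch.assign("region"); // first assignment, intended for RS1
        long rs2Id = sketch.assign("region"); // timeout: reassigned, intended for RS2
        System.out.println(sketch.checkAndPutLocation("region", rs2Id, "RS2")); // true
        System.out.println(sketch.checkAndPutLocation("region", rs1Id, "RS1")); // false
        System.out.println(sketch.location("region")); // RS2
    }
}
```

For the guard itself, the IDs only need to be unique. Using increasing values additionally
lets two IDs be compared to tell which assignment came later.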
                
> If region opening fails after updating META HBCK reports it as inconsistent and scanning the region throws NSRE
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4497
>                 URL: https://issues.apache.org/jira/browse/HBASE-4497
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Priority: Critical
>
> This JIRA was created as per the discussion in the mail chain "HBCK reporting of possible mismatch in RS assignment".
> Consider two RSs, RS1 and RS2.
> A region tries to open on RS1, but it takes a while. RS1 has still not updated META or transitioned the node from OPENING to OPENED.
> So the timeout assigns the region to RS2. RS2 successfully updates META and opens the region.
> Now RS1 tries to act on the region by first updating META and then transitioning the node from OPENING to OPENED.
> RS1's transition of the node from OPENING to OPENED will fail, but the META entry will have RS1 as the latest.
> Now HBCK reports this as an inconsistency, and if we try to scan the region we get NotServingRegionException.
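
The sequence described above can be replayed with a small, simplified sketch. It assumes a
plain map standing in for META and two fields standing in for the region's ZK node; the class
UnguardedOpenSketch and its methods are hypothetical and do not mirror the real region server
code. The point is only that the META write is unconditional while the node transition is
checked, so the two can disagree.

```java
import java.util.HashMap;
import java.util.Map;

public class UnguardedOpenSketch {
    // Stand-ins for the META table and the region's ZK node (both hypothetical).
    public final Map<String, String> meta = new HashMap<>();
    public String znodeState = "OFFLINE";
    public String znodeOwner = null;

    // Master (re)assigns the region: the node is in OPENING under the new owner.
    public void assign(String server) {
        znodeState = "OPENING";
        znodeOwner = server;
    }

    // Region server finishes opening: unconditional META write, then the node transition.
    // Returns whether the OPENING -> OPENED transition succeeded.
    public boolean finishOpen(String region, String server) {
        meta.put(region, server); // no check: the last writer wins
        if ("OPENING".equals(znodeState) && server.equals(znodeOwner)) {
            znodeState = "OPENED";
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        UnguardedOpenSketch s = new UnguardedOpenSketch();
        s.assign("RS1"); // RS1 starts opening, but is slow
        s.assign("RS2"); // timeout: the region is reassigned to RS2
        boolean rs2 = s.finishOpen("region", "RS2");
        boolean rs1 = s.finishOpen("region", "RS1"); // late update from RS1
        System.out.println("RS2 opened: " + rs2 + ", RS1 opened: " + rs1);
        System.out.println("META says " + s.meta.get("region") + ", node owned by " + s.znodeOwner);
    }
}
```

In this replay META ends up pointing at RS1 while the node belongs to RS2, which is the
mismatch HBCK reports.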

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
