hbase-issues mailing list archives

From "Jonathan Gray (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2700) Handle master failover for regions in transition
Date Thu, 24 Jun 2010 18:18:50 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12882262#action_12882262 ]

Jonathan Gray commented on HBASE-2700:
--------------------------------------

In what situation does the data in ZK not reflect the actual state?  For an RS to open a
region, for example, it must transition a node in ZK from nothing, to OPENING, to OPENED;
if any transition fails, the region does not open.  It seems to me that it is META which may
not be up to date, and META which can change without the proper notifications being sent.
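
To make the nothing -> OPENING -> OPENED requirement concrete, here is a minimal sketch.  It
is an in-memory stand-in, not the HBase or ZooKeeper API: the class RegionTransitionSketch and
its methods are hypothetical names, and a ConcurrentHashMap plays the role of the ZK nodes.
The point is only that each step is a conditional update, so a failed step means the region
does not open.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory stand-in for per-region transition nodes in ZK. */
class RegionTransitionSketch {
    enum State { OPENING, OPENED }

    // Region name -> transition state; an absent key models "no node yet".
    private final Map<String, State> nodes = new ConcurrentHashMap<>();

    /** nothing -> OPENING: succeeds only if no node exists for the region. */
    boolean beginOpen(String region) {
        return nodes.putIfAbsent(region, State.OPENING) == null;
    }

    /** OPENING -> OPENED: succeeds only if the node is currently OPENING. */
    boolean finishOpen(String region) {
        return nodes.replace(region, State.OPENING, State.OPENED);
    }

    /** Returns the recorded state, or null if there is no node. */
    State stateOf(String region) {
        return nodes.get(region);
    }

    /** The RS treats the region as open only if both transitions succeed. */
    boolean openRegion(String region) {
        if (!beginOpen(region)) {
            return false; // someone else already owns the transition
        }
        // The real work of opening the region would happen here.
        return finishOpen(region);
    }
}
```

A second openRegion call for the same region returns false, because the node already exists;
that is the property that keeps a state from being silently skipped or overwritten.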

In the style where we ask each RS what it hosts and match that up against META, we must then
do all edits of META on the master side.  Otherwise there will always be race conditions
between what the master thinks the state is (via meta scan) and what the actual state is (via
RSs setting stuff in meta).  ZK allows us to ensure we never miss states and transitions.

For the second list, of RSs up in ZK, we could get this data in META, but what about the case
where an RS died while something was being assigned to it, before it finished opening?
Whether this is a problem or not depends very much on who edits meta, whether we rely on meta
to determine that something is not assigned, etc.

How this is handled in the BT paper has been considered, but I guess I am just of the mindset
that explicit, persistent message passing via ZK is a better direction than meta scanning /
per-RS check-in / heartbeating.  What happens if we have 1000 RSs and 1M regions?  That's a
significant amount of work to do.  What if a single RS happens to be in a 10-second GC pause?
What about race conditions between what is in META and what the RSs know about?  What if we
see in META that something is unassigned, but the previous master had asked an RS to open it?
That RS is in the "opening" state but the region is not yet assigned, so would the region come
back in the list of regions assigned to that server?  This is super explicit via transitions
in ZK.
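
As a rough illustration of why persisted transitions make failover explicit, the sketch below
has a new master read the in-transition records and the set of live RSs and decide what to do
per region.  Everything in it is an assumption for illustration: the class
FailoverRecoverySketch, the Action names, and the decision rules are hypothetical and are not
taken from the HBase code or from this issue.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical sketch of a new master recovering regions in transition. */
class FailoverRecoverySketch {
    enum State { OPENING, OPENED }
    enum Action { WAIT_FOR_OPEN, REASSIGN, MARK_ASSIGNED }

    /** One persisted transition record: the state and the RS that owns it. */
    static final class Transition {
        final State state;
        final String server;

        Transition(State state, String server) {
            this.state = state;
            this.server = server;
        }
    }

    /**
     * Builds a per-region plan from the persisted records alone, without
     * asking every RS what it hosts. Assumed rules: a record owned by a
     * dead RS is reassigned; OPENING on a live RS is waited on; OPENED on
     * a live RS is recorded as assigned.
     */
    static Map<String, Action> recover(Map<String, Transition> inTransition,
                                       Set<String> liveServers) {
        Map<String, Action> plan = new TreeMap<>();
        for (Map.Entry<String, Transition> e : inTransition.entrySet()) {
            Transition t = e.getValue();
            Action action;
            if (!liveServers.contains(t.server)) {
                action = Action.REASSIGN;
            } else if (t.state == State.OPENING) {
                action = Action.WAIT_FOR_OPEN;
            } else {
                action = Action.MARK_ASSIGNED;
            }
            plan.put(e.getKey(), action);
        }
        return plan;
    }
}
```

The work here is proportional to the number of regions in transition rather than to the total
number of regions, which is the contrast with the scan-and-check-in style being drawn above.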

As for holding it all in memory, I think we can punt on this for a while.  The only thing
pertinent to this discussion is that if holding it all in memory is possibly untenable,
doesn't that mean it's also untenable to do master failover in this style (holding every RS
and its regions after asking it via RPC, and holding the META view of every region and the RS
it is assigned to)?

> Handle master failover for regions in transition
> ------------------------------------------------
>
>                 Key: HBASE-2700
>                 URL: https://issues.apache.org/jira/browse/HBASE-2700
>             Project: HBase
>          Issue Type: Sub-task
>          Components: master, zookeeper
>            Reporter: Jonathan Gray
>            Priority: Critical
>             Fix For: 0.21.0
>
>
> To this point in HBASE-2692 tasks we have moved everything for regions in transition
> into ZK, but we have not fully handled the master failover case.  This is to deal with that
> and to write tests for it.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

