hbase-issues mailing list archives

From "Subbu M Iyer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-3210) HBASE-1921 for the new master
Date Sun, 03 Apr 2011 01:23:05 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015089#comment-13015089 ]

Subbu M Iyer commented on HBASE-3210:

First draft of my patch for review.

Here is what is being done now:

1. When a ZK session expiry event causes the ZK node listener to trigger the primary master's
abort, we first try to restore the ZK session. If the session is restored successfully, we
ignore the abort trigger and continue working as primary master.

2. A successful ZK session recovery involves the following.
   a. Create a ZK Session 
   b. Try becoming the primary master again. (so that we don't step onto secondary master's
   c. Initialize all ZK-based trackers. This includes the AssignmentManager, CatalogTracker,
      RegionServerTracker and ClusterStatusTracker.
   d. Assign Root and Meta. (We just ensure that our local in-memory structures are correctly
      updated to reflect our earlier Root/Meta assignments.)
   e. Process any RIT (regions in transition) that came in during our blackout.

3. Refactored the master startup logic so that it can be reused during a master session recovery.
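
The flow in points 1 and 2 can be sketched roughly as below. This is not code from the patch,
only a minimal illustration of the ordering. Every name in it (MasterSessionRecoverySketch,
RecoverySteps, tryRecoverSession, shouldAbort) is hypothetical and stands in for the real
master and ZooKeeper operations. It also assumes that failing any step means the recovery
failed and the abort should proceed.

```java
/**
 * Hypothetical sketch of the recovery flow described above.
 * None of these names are taken from the HBase code base or from the patch.
 */
final class MasterSessionRecoverySketch {

    /** Stand-ins for the real ZooKeeper and master operations (assumed, not HBase APIs). */
    interface RecoverySteps {
        boolean createZkSession();          // step 2a
        boolean becomePrimaryMaster();      // step 2b
        void initializeTrackers();          // step 2c
        void assignRootAndMeta();           // step 2d
        void processRegionsInTransition();  // step 2e
    }

    /** Runs steps 2a-2e in order; returns true only if every step completed. */
    static boolean tryRecoverSession(RecoverySteps steps) {
        try {
            if (!steps.createZkSession()) {
                return false;
            }
            if (!steps.becomePrimaryMaster()) {
                return false; // assumption: we lost the primary role, so give up
            }
            steps.initializeTrackers();
            steps.assignRootAndMeta();
            steps.processRegionsInTransition();
            return true;
        } catch (RuntimeException e) {
            return false; // assumption: any failure means recovery did not succeed
        }
    }

    /**
     * Point 1: returns true if the master should really abort,
     * false if the abort trigger can be ignored.
     */
    static boolean shouldAbort(boolean sessionExpired, RecoverySteps steps) {
        if (!sessionExpired) {
            return true; // aborts with other causes are outside this sketch
        }
        return !tryRecoverSession(steps);
    }
}
```

An abort handler would call shouldAbort(true, steps) on a session expiry and carry on as
primary master when it returns false. Keeping the steps behind one interface is also one way
to share them between normal startup and recovery, as in point 3.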


> HBASE-1921 for the new master
> -----------------------------
>                 Key: HBASE-3210
>                 URL: https://issues.apache.org/jira/browse/HBASE-3210
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Jean-Daniel Cryans
>            Priority: Critical
>             Fix For: 0.92.0
>         Attachments: HBASE-3210-When_the_Master_s_session_times_out_and_there_s_only_one,_cluster_is_wedged.patch
> HBASE-1921 was lost when writing the new master code. I guess it's going to be much harder
to implement now, but I think it's a critical feature to have considering the reasons that
brought me to do it in the old master. There's already a test in TestZooKeeper which was
disabled a while ago.

