hbase-dev mailing list archives

From "Jonathan Gray" <jg...@apache.org>
Subject Re: Review Request: HBASE-2700 Unit test of master failover while regions in transition
Date Mon, 18 Oct 2010 18:05:52 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/995/
-----------------------------------------------------------

(Updated 2010-10-18 11:05:52.404802)


Review request for hbase and stack.


Changes
-------

Small cleanup of comments and whitespace.


Summary
-------

First go at a unit test of master failover with regions in transition.

Comment from the test method:

  /**
   * Complex test of master failover that covers as many permutations as
   * possible of the states that regions in transition could be in within ZK.
   * <p>
   * This tests the proper handling of these states by the failed-over master
   * and includes a thorough testing of the timeout code as well.
   * <p>
   * Starts with a single master and three regionservers.
   * <p>
   * Creates two tables, enabledTable and disabledTable, each containing 5
   * regions.  The disabledTable is then disabled.
   * <p>
   * After reaching steady-state, the master is killed.  We then mock several
   * states in ZK.
   * <p>
   * After mocking them, we will start up a new master which should become the
   * active master and also detect that it is a failover.  The primary test
   * passing condition will be that all regions of the enabled table are
   * assigned and all the regions of the disabled table are not assigned.
   * <p>
   * The different scenarios to be tested are below:
   * <p>
   * <b>ZK State:  OFFLINE</b>
   * <p>A node can get into OFFLINE state if</p>
   * <ul>
   * <li>An RS fails to open a region, so it reverts the state back to OFFLINE
   * <li>The Master is assigning the region to an RS but has not yet sent the RPC
   * </ul>
   * <p>We will mock the scenarios</p>
   * <ul>
   * <li>Master has assigned an enabled region but RS failed so a region is
   *     not assigned anywhere and is sitting in ZK as OFFLINE</li>
   * <li>This seems to cover both cases?</li>
   * </ul>
   * <p>
   * <b>ZK State:  CLOSING</b>
   * <p>A node can get into CLOSING state if</p>
   * <ul>
   * <li>An RS has begun to close a region
   * </ul>
   * <p>We will mock the scenarios</p>
   * <ul>
   * <li>Region was being closed but the RS died before finishing the close
   * <li>Region of enabled table was being closed but did not complete
   * <li>Region of disabled table was being closed but did not complete
   * </ul>
   * <p>
   * <b>ZK State:  CLOSED</b>
   * <p>A node can get into CLOSED state if</p>
   * <ul>
   * <li>An RS has completed closing a region but the master has not acknowledged it yet
   * </ul>
   * <p>We will mock the scenarios</p>
   * <ul>
   * <li>Region of a table that should be enabled was closed on an RS
   * <li>Region of a table that should be disabled was closed on an RS
   * </ul>
   * <p>
   * <b>ZK State:  OPENING</b>
   * <p>A node can get into OPENING state if</p>
   * <ul>
   * <li>An RS has begun to open a region
   * </ul>
   * <p>We will mock the scenarios</p>
   * <ul>
   * <li>RS was opening a region of enabled table but never finished
   * </ul>
   * <p>
   * <b>ZK State:  OPENED</b>
   * <p>A node can get into OPENED state if</p>
   * <ul>
   * <li>An RS has finished opening a region but the master has not acknowledged it yet
   * </ul>
   * <p>We will mock the scenarios</p>
   * <ul>
   * <li>Region of a table that should be enabled was opened on an RS
   * <li>Region of a table that should be disabled was opened on an RS
   * <li>Region of a table that should be enabled was opened by a now-dead RS
   * <li>Region of a table that should be disabled was opened by a now-dead RS
   * </ul>
   * <p>
   * <b>ZK State:  NONE</b>
   * <p>A region may have no transition node if</p>
   * <ul>
   * <li>The server hosting the region died and no master processed it
   * </ul>
   * <p>We will mock the scenarios</p>
   * <ul>
   * <li>Region of enabled table was on a dead RS that was not yet processed
   * <li>Region of disabled table was on a dead RS that was not yet processed
   * </ul>
   * @throws Exception
   */
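
As a rough illustration of the passing condition described in the comment above (this is not code from the patch), the scenario matrix can be modelled as plain data. Every name here (FailoverScenarioSketch, ZkState, Scenario, expectAssigned) is hypothetical and does not come from HBase. The serverDead flags are assumptions read off the wording of each bullet, and the first CLOSING bullet is left out because the comment does not say which table that region belongs to.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the scenario matrix; none of these names come from HBase. */
final class FailoverScenarioSketch {

  /** Transition states named in the comment; NONE means no transition node in ZK. */
  enum ZkState { OFFLINE, CLOSING, CLOSED, OPENING, OPENED, NONE }

  /** One mocked region: its ZK state, its table's status, and whether its RS is dead. */
  static final class Scenario {
    final ZkState state;
    final boolean tableEnabled;
    final boolean serverDead;

    Scenario(ZkState state, boolean tableEnabled, boolean serverDead) {
      this.state = state;
      this.tableEnabled = tableEnabled;
      this.serverDead = serverDead;
    }
  }

  /** Primary passing condition: a region ends up assigned iff its table is enabled. */
  static boolean expectAssigned(Scenario s) {
    return s.tableEnabled;
  }

  /** Scenarios from the comment whose table is stated explicitly. */
  static List<Scenario> scenarios() {
    List<Scenario> list = new ArrayList<>();
    // Assumption: the RS failed the open but is itself still alive.
    list.add(new Scenario(ZkState.OFFLINE, true, false));
    list.add(new Scenario(ZkState.CLOSING, true, false));
    list.add(new Scenario(ZkState.CLOSING, false, false));
    list.add(new Scenario(ZkState.CLOSED, true, false));
    list.add(new Scenario(ZkState.CLOSED, false, false));
    list.add(new Scenario(ZkState.OPENING, true, false));
    list.add(new Scenario(ZkState.OPENED, true, false));
    list.add(new Scenario(ZkState.OPENED, false, false));
    list.add(new Scenario(ZkState.OPENED, true, true));   // opened by a now-dead RS
    list.add(new Scenario(ZkState.OPENED, false, true));
    list.add(new Scenario(ZkState.NONE, true, true));     // dead RS, not yet processed
    list.add(new Scenario(ZkState.NONE, false, true));
    return list;
  }

  /** Counts the scenarios whose region should be assigned after failover. */
  static long countExpectedAssigned(List<Scenario> all) {
    long n = 0;
    for (Scenario s : all) {
      if (expectAssigned(s)) {
        n++;
      }
    }
    return n;
  }

  public static void main(String[] args) {
    List<Scenario> all = scenarios();
    System.out.println(countExpectedAssigned(all) + " of " + all.size()
        + " mocked regions should end up assigned");
  }
}
```

The point of the sketch is that the expected outcome depends only on whether the table is enabled, regardless of which ZK state the region was mocked into or whether its RS is still alive.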


This addresses bug HBASE-2700.
    http://issues.apache.org/jira/browse/HBASE-2700


Diffs (updated)
-----

  trunk/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java 1023927 
  trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1023927 
  trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 1023927 
  trunk/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 1023927 
  trunk/src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java 1023927
  trunk/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java 1023927
  trunk/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java 1023927
  trunk/src/main/java/org/apache/hadoop/hbase/util/JVMClusterUtil.java 1023927 
  trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java 1023927 
  trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 1023927 
  trunk/src/test/java/org/apache/hadoop/hbase/MiniHBaseCluster.java 1023927 
  trunk/src/test/java/org/apache/hadoop/hbase/master/TestMasterFailover.java 1023927 

Diff: http://review.cloudera.org/r/995/diff


Testing
-------

running the unit test!


Thanks,

Jonathan

