hbase-dev mailing list archives

From st...@duboce.net
Subject Re: Review Request: Add separate handling of PENDING_OPEN/PENDING_CLOSE in timeout monitor and additional testing
Date Tue, 26 Oct 2010 07:55:02 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1087/
-----------------------------------------------------------

(Updated 2010-10-26 00:55:02.299835)


Review request for hbase and stack.


Changes
-------

This patch is almost there.  It's much better.  Fixed the test for the .META. server: it
was looking in the map of servers to regions, which won't work since it's a map of user
regions only.  Instead, get it from the CatalogTracker.

Locally TestRegionRebalancing failed.  I need to look at that.

On the cluster, we turned up an unexpected state: a server was opening a region while it was
also going down.  I need to dig in on that too.

I also want to add tests, at least for a moved .META. region.


Summary
-------

Adds new handling of the timeouts for PENDING_OPEN and PENDING_CLOSE in-memory master RIT
states.
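
As a rough illustration of what per-state timeout handling can look like, here is a minimal
Java sketch.  It is not the code in this patch: the class RitTimeoutSketch, the Action enum,
the checkTimeouts method, and the recovery action chosen for each state are all hypothetical
and only show the idea of treating PENDING_OPEN and PENDING_CLOSE separately from other
timed-out RIT states.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from the HBase source.
class RitTimeoutSketch {
    /** Subset of region-in-transition states, for illustration only. */
    public enum State { PENDING_OPEN, OPENING, PENDING_CLOSE, CLOSING }

    /** Assumed recovery actions; the real patch may do something different. */
    public enum Action { REASSIGN, RESEND_CLOSE, LOG_ONLY }

    /** In-memory record of a region in transition and when it entered its state. */
    public static final class RegionState {
        final State state;
        final long stampMs;

        public RegionState(State state, long stampMs) {
            this.state = state;
            this.stampMs = stampMs;
        }
    }

    /** Picks an action for a timed-out region, with separate branches for the PENDING states. */
    public static Action actionFor(State state) {
        switch (state) {
            case PENDING_OPEN:
                return Action.REASSIGN;      // assumption: the open never started, so assign again
            case PENDING_CLOSE:
                return Action.RESEND_CLOSE;  // assumption: the close never started, so ask again
            default:
                return Action.LOG_ONLY;      // other states are only reported in this sketch
        }
    }

    /** Returns an action for every region that has been in its state longer than timeoutMs. */
    public static Map<String, Action> checkTimeouts(
            Map<String, RegionState> regionsInTransition, long nowMs, long timeoutMs) {
        Map<String, Action> actions = new LinkedHashMap<>();
        for (Map.Entry<String, RegionState> e : regionsInTransition.entrySet()) {
            RegionState rs = e.getValue();
            if (nowMs - rs.stampMs > timeoutMs) {
                actions.put(e.getKey(), actionFor(rs.state));
            }
        }
        return actions;
    }

    public static void main(String[] args) {
        Map<String, RegionState> rit = new LinkedHashMap<>();
        rit.put("regionA", new RegionState(State.PENDING_OPEN, 0L));
        rit.put("regionB", new RegionState(State.PENDING_CLOSE, 0L));
        rit.put("regionC", new RegionState(State.PENDING_OPEN, 50_000L)); // not timed out yet
        System.out.println(checkTimeouts(rit, 60_000L, 30_000L));
    }
}
```

The point of the sketch is only that the two PENDING states each get their own branch instead
of falling through to a generic timeout path.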

Adds some new broken RIT states into TestMasterFailover.

Some of these broken states don't seem possible to me, but as long as we aren't breaking the
existing behaviors and tests, I think it's okay if we handle odd cases that can be mocked.
Who knows what will happen in the real world.

The reason TestMasterFailover didn't/doesn't really test for the issue in HBASE-3147 is that
this new broken condition happens when an RS dies / goes offline, rather than during a master
failover concurrent w/ RS failure.


v4 of the patch adds to Jon's fixes.  It adds a shutdown server handler for root and another
for meta so the processing of servers hosting meta/root does not get frozen out.  I've seen
this in my testing.


This addresses bug HBASE-3147.
    http://issues.apache.org/jira/browse/HBASE-3147


Diffs (updated)
-----

  trunk/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java 1027351 
  trunk/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java 1027351 
  trunk/src/main/java/org/apache/hadoop/hbase/executor/EventHandler.java 1027351 
  trunk/src/main/java/org/apache/hadoop/hbase/executor/ExecutorService.java 1027351 
  trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1027351 
  trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 1027351 
  trunk/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 1027351 
  trunk/src/main/java/org/apache/hadoop/hbase/master/handler/MetaServerShutdownHandler.java PRE-CREATION 
  trunk/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java 1027351 
  trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/MetaNodeTracker.java 1027351 
  trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java 1027351 
  trunk/src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java 1027351 
  trunk/src/test/java/org/apache/hadoop/hbase/master/TestMasterFailover.java 1027351 

Diff: http://review.cloudera.org/r/1087/diff


Testing
-------

TestMasterFailover passes.


Thanks,

Jonathan

