hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Aravind Gottipati (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4298) Support to drain RS nodes through ZK
Date Mon, 26 Sep 2011 22:41:12 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114991#comment-13114991

Aravind Gottipati commented on HBASE-4298:

@Ted: Thank you for the review.  I made some changes and updated my patch (in github).  Notes
in line.

* I think we only need to log the number of draining servers.
* The javadoc should state that keys of the map are region servers.
* For DrainingServerTracker.java, please remove year.
* For nodeChildrenChanged(), please change the sentence for catch black of IOException, it
mentioned zk exception.
* Should remove the createNode rather than just comment it out.
* serverManager methods is different between add() and remove(): one inside synchronized block,
one outside.
* I think a better name maybe "zookeeper.znode.draining.rs"

- I agree with all of these and they are all fixed in my latest code push (on github).

* I wonder if Map is needed for drainingServers because it is private and getDrainingServersList()
only returns the keySet.
- The map isn't required, but I followed the example of onlineServers and serverConnections.
 For code in trunk, I have changed it to a ArrayList.  A similar change does not work (easily)
in the 0.90 branch.  Code in AssignmentManager uses HServerInfo in 0.90, and changing drainingServers
to an array list will mean key lookups etc.  I have left it as a Map in 0.90, but I changed
it to a list in trunk.

* removeServerFromDrainList / addServerToDrainList should return a boolean.
- The remove and add methods are called from DrainingServerTracker.  The context is a ZK callback,
and the corresponding remove and add functions there simply return voids.  I changed the code
to return booleans in trunk, but left it as void in the 0.90 branch. I figured they might
actually be used in trunk, but I doubt they will be in 0.90.

* Unit tests..
- I will work with Stack and get the tests to you.

* Share your experience from using this in your environment.
- To reboot the cluster, we currently drain one server at a time (using the graceful stop
shell script).  This process takes forever to go through all the servers.  The goal here is
to enable us to drain multiple servers simultaneously.  Doing this by keeping track of servers
externally makes the programming painful, and we'd have to share state somehow between different
scripts that all aim to drain different servers.  Leaving this list in ZK and having HBase
keep them from getting new regions seems like the right way to go about it.  I have tested
this in a test cluster of about 14 servers.  This code by itself only solves one part of our
problem.  The rest of it will be solved by command line scripts that will create nodes to
be shut down under /draining/rs in ZK, and then move regions out from them.

Please let me know if you have any other questions about this stuff.

> Support to drain RS nodes through ZK
> ------------------------------------
>                 Key: HBASE-4298
>                 URL: https://issues.apache.org/jira/browse/HBASE-4298
>             Project: HBase
>          Issue Type: Improvement
>          Components: master
>    Affects Versions: 0.90.4
>         Environment: all
>            Reporter: Aravind Gottipati
>            Priority: Critical
>              Labels: patch
>             Fix For: 0.92.0, 0.90.5
> HDFS currently has a way to exclude certain datanodes and prevent them from getting new
blocks.  HDFS goes one step further and even drains these nodes for you.  This enhancement
is a step in that direction.
> The idea is that we mark nodes in zookeeper as draining nodes.  This means that they
don't get any more new regions.  These draining nodes look exactly the same as the corresponding
nodes in /rs, except they live under /draining.
> Eventually, support for draining them can be added.  I am submitting two patches for
review - one for the 0.90 branch and one for trunk (in git).
> Here are the two patches
> 0.90 - https://github.com/aravind/hbase/commit/181041e72e7ffe6a4da6d82b431ef7f8c99e62d2
> trunk - https://github.com/aravind/hbase/commit/e127b25ae3b4034103b185d8380f3b7267bc67d5
> I have tested both these patches and they work as advertised.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira


View raw message