hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "ZooKeeper/HBaseUseCases" by PatrickHunt
Date Thu, 05 Nov 2009 19:43:14 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "ZooKeeper/HBaseUseCases" page has been changed by PatrickHunt.


  === Case 2 ===
  Summary: HBase Region Transitions from unassigned to open and from open to unassigned with
some intermediate states
  Expected scale: 100k regions across thousands of RegionServers
+ [PDH start]
+ This sounds like two recipes: "dynamic configuration" (or "dynamic sharding", the same thing except the data may be a bit larger) and "group membership". Basically, you want a list of region servers that are available to do work, and a master that coordinates the work among them. You also want the work handed to each RS to be acted upon in order (state transitions), and you want to know the status of that work at any point in time.
+ Here's a rough idea; let me know if I've understood the use case correctly. It would obviously have to be fleshed out more. I've chosen arbitrary paths below; you'd obviously want some sort of prefix, better names, etc.
+ 1) group membership:
+  # have a /regionservers znode
+  # master watches /regionservers for any child changes
+  # as each region server becomes available to do work (or to track state, if it is up but not available) it creates an ephemeral node
+   * /regionservers/<host:port> = <status>
+  # master watches /regionservers/<host:port> and cleans up if the RS goes away
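The group membership steps above can be sketched with a toy in-memory model. This is not the ZooKeeper client API: `ToyZk`, its methods, the session ids and the host:port values are all hypothetical stand-ins, and the sketch assumes the ephemeral nodes live directly under /regionservers so that the master's child watch sees them. It only illustrates the intended behaviour: an ephemeral node disappears when its session ends, and the watcher on the parent is told about the new child list.

```python
from collections import defaultdict


class ToyZk:
    """In-memory stand-in for a znode tree; NOT the real ZooKeeper client."""

    def __init__(self):
        self.nodes = {"/": (None, None)}         # path -> (data, owning session)
        self.child_watchers = defaultdict(list)  # parent path -> callbacks

    @staticmethod
    def _parent(path):
        return path.rsplit("/", 1)[0] or "/"

    def create(self, path, data=None, session=None):
        # A non-None session marks the znode as ephemeral in this toy model.
        parent = self._parent(path)
        if parent not in self.nodes:
            raise KeyError("no parent znode: " + parent)
        self.nodes[path] = (data, session)
        self._notify(parent)

    def children(self, parent):
        prefix = parent.rstrip("/") + "/"
        names = []
        for path in self.nodes:
            name = path[len(prefix):]
            if path.startswith(prefix) and name and "/" not in name:
                names.append(name)
        return sorted(names)

    def watch_children(self, parent, callback):
        # Simplification: real ZooKeeper watches fire once and must be re-set.
        self.child_watchers[parent].append(callback)

    def close_session(self, session):
        # Ephemeral znodes vanish when the session that created them ends.
        for path in [p for p, (_, s) in self.nodes.items() if s == session]:
            del self.nodes[path]
            self._notify(self._parent(path))

    def _notify(self, parent):
        for callback in self.child_watchers[parent]:
            callback(self.children(parent))


zk = ToyZk()
zk.create("/regionservers")

# Master: record every membership change it is notified about.
seen = []
zk.watch_children("/regionservers", lambda kids: seen.append(kids))

# Two region servers register (hypothetical hosts, ports and session ids).
zk.create("/regionservers/host1:60020", data="available", session="s1")
zk.create("/regionservers/host2:60020", data="available", session="s2")

# host1 goes away: its session ends and its ephemeral znode is removed.
zk.close_session("s1")

for membership in seen:
    print(membership)
```

In a real deployment the master would also have to re-register its watch after each notification and re-read the child list, since changes can happen between the notification and the next read.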
+ 2) task assignment (ie dynamic configuration)
+  # have a /tables znode
+  # /tables/<regionserver by host:port>, created when the master notices a new region server
+   * RS host:port watches this node for any child changes
+  # /tables/<regionserver by host:port>/<regionX> znode for each region assigned to RS host:port
+   * RS host:port watches this node in case the region is reassigned by the master or changes state
+  # /tables/<regionserver by host:port>/<regionX>/<state>-<seq#> znode created by the master
+   * the seq# ensures the RS sees state changes in order
+   * the RS deletes old state znodes, so the oldest remaining entry is the current state; there is always at least one znode here (the current state)
+ [PDH end]
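As a rough illustration of the `<state>-<seq#>` ordering idea in the task assignment recipe, here is a minimal sketch of how a region server might interpret the children of a region znode. It assumes the master creates these znodes with ZooKeeper's sequential flag, which appends a zero-padded 10-digit counter to the name. The function names and the state names ("unassigned", "opening", "open") are illustrative assumptions, not an existing HBase or ZooKeeper API.

```python
def parse_state_znode(name):
    """Split a '<state>-<seq#>' child name into (seq, state)."""
    state, _, seq = name.rpartition("-")
    return int(seq), state


def current_state(children):
    """The oldest remaining entry (lowest seq#) is the current state."""
    # The child list is not guaranteed to be ordered, so order by seq#.
    return min(parse_state_znode(c) for c in children)[1]


def advance(children):
    """Drop the oldest entry if a newer one exists, so one znode always remains."""
    ordered = sorted(children, key=parse_state_znode)
    return ordered[1:] if len(ordered) > 1 else ordered


# Hypothetical children of /tables/<regionserver by host:port>/<regionX>,
# listed out of order on purpose.
children = ["open-0000000012", "opening-0000000011", "unassigned-0000000010"]

history = [current_state(children)]
while len(children) > 1:
    children = advance(children)   # RS deletes the old state znode
    history.append(current_state(children))

print(history)
print(children)
```

The point of the design is that the region server never has to guess the order of transitions: sorting on the sequence suffix reproduces the order in which the master created the znodes, and the last znode is never deleted, so the current state is always readable.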
  General recipe implemented: None yet; help needed. We were thinking of keeping queues in ZK, one per regionserver, listing the regions for it to open, close, etc. However, the list of all regions is currently kept elsewhere, in our .META. catalog table, and probably will be for the foreseeable future. Some further description can be found here: [[http://wiki.apache.org/hadoop/Hbase/MasterRewrite#regionstate|Master Rewrite: Region State]]
