hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hbase/MasterRewrite" by stack
Date Tue, 17 Nov 2009 23:37:12 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hbase/MasterRewrite" page has been changed by stack.
http://wiki.apache.org/hadoop/Hbase/MasterRewrite?action=diff&rev1=14&rev2=15

--------------------------------------------------

  Current thinking is to keep the region lifecycle all up in zookeeper, but that won't scale.  Postulate 100k regions -- 100TB at 1GB per region -- each with two or three possible states, each with watchers for state change.  My guess is that this is too much to put in zk (Mahadev+Patrick say it is not, if the data is small).  TODO: how to manage the transition from zk to .META.?  Also, can't do getClosest up in zk, only in .META.
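
A rough sketch of the arithmetic behind the 100k figure, in Python. It assumes decimal units (1TB = 10^12 bytes) and reads "two or three possible states each with watchers" as up to three watched znodes per region; both are assumptions, and the function names are made up for illustration.

```python
# Back-of-envelope sizing for the "100k regions" postulate.
# Assumptions (not from the page): decimal units, one watched znode per
# possible state per region.
TB = 10**12
GB = 10**9


def region_count(total_bytes, region_bytes):
    """Number of regions needed to hold total_bytes at region_bytes each."""
    return total_bytes // region_bytes


def watched_znodes(regions, states_per_region):
    """Upper bound on watched znodes if every state gets its own znode."""
    return regions * states_per_region


regions = region_count(100 * TB, 1 * GB)
low = watched_znodes(regions, 2)
high = watched_znodes(regions, 3)
print(regions, low, high)
```

Under these assumptions zk would hold a few hundred thousand small, watched znodes for region state alone.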
  
  ===== Design =====
- Here is [[http://wiki.apache.org/hadoop/ZooKeeper/HBaseUseCases#case2|Patrick's suggestion]].  We already keep a znode per regionserver, though it's named for the regionserver's startcode.  On evaporation of the regionserver ephemeral node, master would run a reconciliation (or, on assumption of the master role, a new master would check state in zk, making sure there is a regionserver per region), adding unassigned regions back to the unassigned pool.
+ Here is [[http://wiki.apache.org/hadoop/ZooKeeper/HBaseUseCases#case2|Patrick's suggestion]].  We already keep a znode per regionserver, though it's named for the regionserver's startcode -- see the 'rs' directory in 0.20.x zookeepers.  On evaporation of the regionserver ephemeral node, master would run a reconciliation (or, on assumption of the master role, a new master would check state in zk, making sure there is a regionserver per region, etc.).
  
- All regions would be listed in the .META. table always.  Whether they are online, splitting or closing, etc., would be up in zk.
+ All regions would be listed in the .META. table always.  Whether they are online, splitting or closing, etc., would be up in zk.  So, figuring out whether something is unassigned would be a case of a .META. table scan.  Anything not managed by zk needs to be added there (assigned).
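
A minimal sketch of that reconciliation as a set difference, in Python. It uses plain in-memory data in place of a real .META. scan and real zk reads; `find_unassigned` and its arguments are hypothetical names, not HBase or ZooKeeper APIs.

```python
def find_unassigned(meta_regions, zk_assignments, live_servers):
    """Return regions listed in .META. that no live regionserver holds in zk.

    meta_regions   -- iterable of region names from a .META. scan (assumed input)
    zk_assignments -- dict of regionserver name -> list of region names found
                      under that server's znode (assumed input)
    live_servers   -- set of regionservers whose ephemeral znode still exists
    """
    assigned = set()
    for server, regions in zk_assignments.items():
        # Regions held by a server whose ephemeral node evaporated do not count.
        if server in live_servers:
            assigned.update(regions)
    return sorted(set(meta_regions) - assigned)


meta = ["r1", "r2", "r3", "r4"]
zk = {"a:60020:1": ["r1"], "b:60020:2": ["r2", "r3"]}
live = {"a:60020:1"}
unassigned = find_unassigned(meta, zk, live)
print(unassigned)
```

Here regionserver b has gone away, so its regions fall back into the unassigned set along with r4, which zk never tracked.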
+ 
+ ====== zk layout ======
+ Here is some cleanup of [[http://wiki.apache.org/hadoop/ZooKeeper/HBaseUseCases#case2|Patrick's suggestion]]
+ 
+ {{{
+ # First, redo the current 'rs' directory slightly:
+ /hbase/regionservers # master watches /regionservers for any child changes
+ /hbase/regionserver/<host:port:startcode> = <status> # As each region server becomes available to do work (or track state if up but not avail) it creates an ephemeral node; writes state (up/down).
+ # Master watches all /regionserver/<host:port:startcode> and cleans up if RS goes away or changes status
+ 
+ # Now, for regions
+ /hbase/regions/<regionserver by host:port:startcode> # Gets created when master notices new region server
+ # RS host:port watches this node for any child changes
+ 
+ /hbase/regions/<regionserver by host:port:startcode>/<regionXYZ> # znode for each region assigned to RS host:port.
+ # RS host:port watches this node in case reassigned by master, or region changes state
+ 
+ #
+ /tables/<regionserver by host:port:startcode>/<regionXYZ>/<state>-<seq#> # znode created by master
+ # seq ensures order seen by RS
+ # RS deletes old state znodes as it transitions out, oldest entry is the current state, always 1 or more znode here -- the current state
+ }}}
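
The `<state>-<seq#>` rule above (oldest entry is the current state) can be sketched in Python as below. This is only an illustration over plain strings, with no zk client involved; the helper names and the 10-digit zero-padded sequence suffix are assumptions.

```python
def region_state_path(server, region, state, seq):
    """Build a state znode path following the layout above.

    The 10-digit zero padding is an assumption made for illustration.
    """
    return f"/tables/{server}/{region}/{state}-{seq:010d}"


def current_state(children):
    """Pick the current state from the child names of a region znode.

    Each child is named '<state>-<seq#>'; the lowest sequence number
    (the oldest entry) is taken as the current state.
    """
    if not children:
        raise ValueError("expected at least one state znode")
    parsed = []
    for child in children:
        # Split on the last hyphen so state names may contain hyphens.
        state, seq = child.rsplit("-", 1)
        parsed.append((int(seq), state))
    return min(parsed)[1]


path = region_state_path("host1:60020:1258500000000", "regionXYZ", "open", 7)
state = current_state(["opening-0000000003", "closed-0000000002"])
print(path, state)
```

Once the RS deletes the `closed` znode as it transitions out, `opening` becomes the oldest entry and so the current state.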
+ 
+ ====== Questions ======
+ 
+ Should the region znode have state?  E.g. no flush, no compaction, so that we could do a backup by copying a region at a time?
  
  <<Anchor(clean)>>
  
