hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "ZooKeeper/HBaseUseCases" by PatrickHunt
Date Thu, 05 Nov 2009 20:05:13 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "ZooKeeper/HBaseUseCases" page has been changed by PatrickHunt.
http://wiki.apache.org/hadoop/ZooKeeper/HBaseUseCases?action=diff&rev1=6&rev2=7

--------------------------------------------------

   # master watches /regionservers for any child changes
   # as each region server becomes available to do work (or track state if up but not avail) it creates an ephemeral node
    * /regionserver/<host:port> = <status>
-  # master watches /regionserver/<host:port> and cleans up if RS goes away
+  # master watches /regionserver/<host:port> and cleans up if RS goes away or changes status
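
The registration flow above can be sketched with a toy in-memory model (plain Python, not a real ZooKeeper client; `ZkSim`, `Master`, and the `rs1`/`rs2` hosts are illustrative names, not HBase code):

```python
# Toy model: region servers create "ephemeral" nodes under /regionserver,
# and the master's watch fires on any membership or status change. In real
# ZooKeeper the server deletes ephemerals when the owning session expires.

class ZkSim:
    """Minimal znode store with ephemeral nodes tied to a session id."""
    def __init__(self):
        self.nodes = {}           # path -> (data, session)
        self.child_watchers = []  # callbacks fired on any change

    def create_ephemeral(self, path, data, session):
        self.nodes[path] = (data, session)
        self._notify()

    def set_data(self, path, data):
        _, session = self.nodes[path]
        self.nodes[path] = (data, session)
        self._notify()

    def expire_session(self, session):
        # ZooKeeper deletes a session's ephemeral nodes when it expires
        for p in [p for p, (_, s) in self.nodes.items() if s == session]:
            del self.nodes[p]
        self._notify()

    def _notify(self):
        for cb in self.child_watchers:
            cb(sorted(self.nodes))

class Master:
    """Tracks live region servers via a watch on /regionserver children."""
    def __init__(self, zk):
        self.live = []
        zk.child_watchers.append(self.on_change)

    def on_change(self, children):
        self.live = children  # anything absent has gone away -- clean up

zk = ZkSim()
master = Master(zk)
zk.create_ephemeral("/regionserver/rs1:60020", "starting", session=1)
zk.create_ephemeral("/regionserver/rs2:60020", "starting", session=2)
zk.set_data("/regionserver/rs1:60020", "serving")  # status change, watch fires
zk.expire_session(2)  # rs2 "goes away"; its ephemeral node vanishes
print(master.live)    # ['/regionserver/rs1:60020']
```

With a real client the master would re-register its watches after each notification, since ZooKeeper watches are one-shot.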
  
  2) task assignment (ie dynamic configuration)
   # have a /tables znode
@@ -54, +54 @@

   * RS host:port watches this node in case reassigned by master, or region changes state
  # /tables/<regionserver by host:port>/<regionX>/<state>-<seq#> znode created by master
   * seq ensures order seen by RS
-   * RS deletes old state, oldest entry is the current state, always 1 or more znode here -- the current state
+   * RS deletes old state znodes as it transitions out, oldest entry is the current state, always 1 or more znode here -- the current state
+ 
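A minimal sketch of the <state>-<seq#> scheme above, assuming the master creates each state znode with ZooKeeper's SEQUENTIAL flag (faked here with a counter; `RegionStateQueue` is a made-up name, not HBase code):

```python
# Toy model: the master appends a new sequential state znode per region;
# the RS deletes the oldest znode as it transitions out of that state, so
# the oldest surviving entry is always the current state and at least one
# znode always remains.

class RegionStateQueue:
    def __init__(self):
        self.seq = 0
        self.znodes = []  # list of (name, state), ordered by seq#

    def master_set_state(self, state):
        # ZooKeeper's SEQUENTIAL flag would assign the suffix; fake it here
        self.znodes.append((f"{state}-{self.seq:010d}", state))
        self.seq += 1

    def rs_transition(self):
        # transition out of the current (oldest) state: delete its znode;
        # never delete the last one -- it is the current state
        if len(self.znodes) > 1:
            self.znodes.pop(0)
        return self.znodes[0][1]  # current state

q = RegionStateQueue()
q.master_set_state("opening")
q.master_set_state("open")
current = q.rs_transition()  # RS leaves "opening", deletes its znode
print(current, len(q.znodes))  # open 1
```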
+ Any metadata stored for a region znode (ie to identify it)? As long as the size is small, no problem. (If a bit larger, consider /regions/<regionX> znodes holding the identity of each region; otherwise read-only data is fine too.)
+ 
+ Numbers:
+ 
+ 1) 1001 watches by master (1001 znodes)
+ 
+ 2) Numbers for this are:
+  * 1000 watches, one each by RS on /tables (1 znode) -- really this may not be necessary, esp after <self> is created (reduce noise by not setting when not needed)
+  * 1000 watches, one each by RS on /tables/<self> (1000 znodes)
+  * 100K watches, 100 for each RS on /tables/<self>/<region[1-100]> (100k znodes total)
+  * if master wants to monitor region state then we're looking at 100k watches by master
+ 
+ So totally something on the order of 100k watches. No problem. ;-)
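+ The totals above can be double-checked with quick arithmetic, assuming 1000 region servers with ~100 regions each, as the figures imply:

```python
# Back-of-envelope tally of the watch counts listed above (assumed scale:
# 1000 region servers, 100 regions per RS, plus the master).

region_servers = 1000
regions_per_rs = 100

master_rs_watches = region_servers + 1  # /regionservers + each RS znode: 1001
rs_tables_watches = region_servers      # one per RS on /tables
rs_self_watches = region_servers        # one per RS on /tables/<self>
rs_region_watches = region_servers * regions_per_rs        # 100k
master_region_watches = region_servers * regions_per_rs    # optional, 100k

total = (master_rs_watches + rs_tables_watches + rs_self_watches
         + rs_region_watches + master_region_watches)
print(total)  # 203001 -- order of 100k watches, as stated
```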
+ 
+ See [[http://bit.ly/4ekN8G|this perf doc]] for some ideas: 20 clients doing 50k watches each -- 1 million watches on a single-core standalone server, and still << 5ms avg response time (async ops, keep that in mind re implementation). YMMV of course, but your numbers are well below this.
+ 
+ Worst-case scenario -- cascade if all RS become disconnected.
  
  [PDH end]
  
