hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Trivial Update of "Hbase/DesignOverview" by EvgenyRyabitskiy
Date Sun, 08 Mar 2009 19:27:31 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by EvgenyRyabitskiy:
http://wiki.apache.org/hadoop/Hbase/DesignOverview

------------------------------------------------------------------------------
  = Architecture and Implementation =
  
  There are three major components of the HBase architecture:
-  1. The H!BaseMaster (HBase master server)
+  1. The HMaster (HBase master server)
   2. The H!RegionServer (HBase region server)
   3. The HBase client, defined by org.apache.hadoop.hbase.client.HTable
  
  Each will be discussed in the following sections.
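A rough sketch of the client component (item 3 above) in use is shown below. It assumes the contemporary client API (HTable, BatchUpdate, Cell); the table name "mytable" and the column "info:name" are made up for illustration.

{{{
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.BatchUpdate;
import org.apache.hadoop.hbase.io.Cell;

public class ClientSketch {
  public static void main(String[] args) throws Exception {
    // Open the (hypothetical) table "mytable".
    HTable table = new HTable(new HBaseConfiguration(), "mytable");

    // Writes for one row are grouped in a BatchUpdate and committed together.
    BatchUpdate update = new BatchUpdate("row1");
    update.put("info:name", "a value".getBytes());
    table.commit(update);

    // Reads fetch the latest cell for a row/column pair.
    Cell cell = table.get("row1", "info:name");
    System.out.println(new String(cell.getValue()));
  }
}
}}}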
  
  [[Anchor(master)]]
- == HBaseMaster ==
+ == HMaster ==
  
+ There is one HMaster per cluster.
+ 
+ HMaster duties:
+ 
+  * Assigning regions to H!RegionServers
+  * Monitoring the health of each H!RegionServer
+  * Handling changes to the table schema and table administrative functions
+ 
+ === Assigning regions to H!RegionServers ===
+ 
- The H!BaseMaster is responsible for assigning regions to H!RegionServers. The first region
to be assigned is the ''ROOT region'' which locates all the META regions to be assigned. Each
''META region'' maps a number of user regions which comprise the multiple tables that a particular
HBase instance serves. Once all the META regions have been assigned, the master will then
assign user regions to the H!RegionServers, attempting to balance the number of regions served
by each H!RegionServer.
+ The first region to be assigned is the ''ROOT region'' which locates all the META regions
to be assigned. Each ''META region'' maps a number of user regions which comprise the multiple
tables that a particular HBase instance serves. Once all the META regions have been assigned,
the master will then assign user regions to the H!RegionServers, attempting to balance the
number of regions served by each H!RegionServer.
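The balancing step can be pictured with the sketch below; this is not the HMaster's actual assignment logic, just an illustration of handing regions out round-robin so each H!RegionServer ends up serving roughly the same number.

{{{
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: spread user regions evenly across region servers.
class AssignmentSketch {
  static Map<String, List<String>> assign(List<String> regions, List<String> servers) {
    Map<String, List<String>> plan = new HashMap<String, List<String>>();
    for (String server : servers) {
      plan.put(server, new ArrayList<String>());
    }
    // Round-robin assignment keeps the per-server region count balanced.
    for (int i = 0; i < regions.size(); i++) {
      plan.get(servers.get(i % servers.size())).add(regions.get(i));
    }
    return plan;
  }
}
}}}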
  
- It also holds a pointer to the H!RegionServer that is hosting the ROOT region.
+ The location of the ''ROOT region'' is stored in !ZooKeeper.
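A minimal sketch of reading that location through the plain ZooKeeper client API follows; the quorum address and the znode path /hbase/root-region-server are assumptions for illustration, not taken from this page.

{{{
import org.apache.zookeeper.ZooKeeper;

// Sketch: look up the server currently hosting the ROOT region.
class RootLocationSketch {
  static String readRootLocation() throws Exception {
    // Connection string and znode path are assumed for this example.
    ZooKeeper zk = new ZooKeeper("zkhost:2181", 30000, null);
    byte[] data = zk.getData("/hbase/root-region-server", false, null);
    zk.close();
    return new String(data);
  }
}
}}}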
  
- The H!BaseMaster also monitors the health of each H!RegionServer, and if it detects a H!RegionServer
is no longer reachable, it will split the H!RegionServer's write-ahead log so that there is
now one write-ahead log for each region that the H!RegionServer was serving. After it has
accomplished this, it will reassign the regions that were being served by the unreachable
H!RegionServer.
+ === Monitoring the health of each H!RegionServer ===
  
- In addition, the H!BaseMaster is also responsible for handling table administrative functions
such as on/off-lining of tables, changes to the table schema (adding and removing column families),
etc.
+ If the HMaster detects that an H!RegionServer is no longer reachable, it will split the H!RegionServer's
write-ahead log so that there is now one write-ahead log for each region that the H!RegionServer
was serving. After it has accomplished this, it will reassign the regions that were being
served by the unreachable H!RegionServer.
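The log-splitting step can be sketched as grouping the entries of the dead server's single shared log by region and writing them out per region; the LogEntry type below is an illustrative stand-in, not the actual HBase class.

{{{
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative split of one shared write-ahead log into per-region logs.
class LogSplitSketch {
  static class LogEntry {                 // hypothetical stand-in for a WAL record
    final String region;
    final byte[] edit;
    LogEntry(String region, byte[] edit) { this.region = region; this.edit = edit; }
  }

  // Each region gets its own log to replay when it is reassigned elsewhere.
  static Map<String, List<LogEntry>> split(List<LogEntry> sharedLog) {
    Map<String, List<LogEntry>> perRegion = new HashMap<String, List<LogEntry>>();
    for (LogEntry entry : sharedLog) {
      List<LogEntry> log = perRegion.get(entry.region);
      if (log == null) {
        log = new ArrayList<LogEntry>();
        perRegion.put(entry.region, log);
      }
      log.add(entry);
    }
    return perRegion;
  }
}
}}}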
  
- Unlike Bigtable, currently, when the H!BaseMaster dies, the cluster will shut down. In Bigtable,
a Tabletserver can still serve Tablets after its connection to the Master has died. We tie
them together, because we do not currently use an external lock-management system like Bigtable.
The Bigtable Master allocates tablets and a lock manager (''Chubby'') guarantees atomic access
by Tabletservers to tablets. HBase uses just a single central point for all H!RegionServers
to access: the H!BaseMaster.
+ === Handling changes to the table schema and table administrative functions ===
+ 
+ The table schema is the set of tables and their column families. The HMaster can add and remove
column families and turn tables on and off.
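From the client side these administrative functions are reached through the admin API; the sketch below assumes the HBaseAdmin class of the contemporary client, with a made-up table and column family name.

{{{
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

class AdminSketch {
  public static void main(String[] args) throws Exception {
    HBaseAdmin admin = new HBaseAdmin(new HBaseConfiguration());

    // Schema changes require the table to be taken offline first.
    admin.disableTable("mytable");
    admin.addColumn("mytable", new HColumnDescriptor("newfamily"));
    admin.enableTable("mytable");
  }
}
}}}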
+ 
+ If the HMaster dies, the cluster will shut down; this will change once integration with !ZooKeeper
is complete. See [:Hbase/ZookeeperIntegration: ZooKeeper Integration].
  
  === The META Table ===
  
@@ -121, +135 @@

  
  Each row in the ROOT and META tables is approximately 1KB in size. At the default region
size of 256MB, this means that the ROOT region can map 2.6 x 10^5^ META regions, which in
turn map a total of 6.9 x 10^10^ user regions, corresponding to approximately 1.8 x 10^19^
(2^64^) bytes of user data.
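The arithmetic behind those figures, under the stated 1KB-per-row and 256MB-per-region assumptions, works out as in this back-of-the-envelope sketch:

{{{
// Back-of-the-envelope version of the ROOT/META sizing figures above.
class SizingSketch {
  public static void main(String[] args) {
    double rowSize = 1024.0;                  // ~1KB per ROOT/META row
    double regionSize = 256.0 * 1024 * 1024;  // 256MB default region size

    double rowsPerRegion = regionSize / rowSize;        // ~2.6 x 10^5
    double metaRegions = rowsPerRegion;                 // regions the ROOT region can map
    double userRegions = metaRegions * rowsPerRegion;   // ~6.9 x 10^10
    double userBytes = userRegions * regionSize;        // ~1.8 x 10^19, about 2^64

    System.out.printf("META regions %.1e, user regions %.1e, user bytes %.1e%n",
        metaRegions, userRegions, userBytes);
  }
}
}}}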
  
+ Every server (master or region) can get the ''ROOT region'' location from !ZooKeeper.
+ 
  [[Anchor(hregion)]]
- == HRegionServer ==
+ == H!RegionServer ==
  
- The H!RegionServer is responsible for handling client read and write requests. It communicates
with the H!BaseMaster to get a list of regions to serve and to tell the master that it is
alive. Region assignments and other instructions from the master "piggy back" on the heart
beat messages.
+ H!RegionServer duties:
+ 
+  * Serving HRegions assigned to the H!RegionServer
+  * Handling client read and write requests
+  * Flushing cache to HDFS
+  * Keeping HLog
+  * Region Compactions and Splits
+ 
+ === Handling client read and write requests ===
+ 
+ The H!RegionServer communicates with the HMaster to get a list of HRegions to serve and to
tell the master that it is alive. Region assignments and other instructions from the master
"piggy back" on the heartbeat messages.
  
  === Write Requests ===
  
- When a write request is received, it is first written to a write-ahead log called a ''HLog''.
All write requests for every region the region server is serving are written to the same log.
Once the request has been written to the HLog, it is stored in an in-memory cache called the
''Memcache''. There is one Memcache for each HStore.
+ When a write request is received, it is first written to a write-ahead log called a ''HLog''.
All write requests for every region the region server is serving are written to the same log.
Once the request has been written to the HLog, it is stored in an in-memory cache called the
''Memcache''. There is one Memcache for each Store.
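A very condensed sketch of that write path follows; the log and cache structures here are illustrative stand-ins for the HLog and Memcache, not the real HBase classes.

{{{
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Illustrative write path: append to the shared log first, then cache in memory.
class WritePathSketch {
  // One shared write-ahead log for every region this server serves.
  private final List<String> hlog = new ArrayList<String>();
  // One in-memory cache (Memcache) per Store, keyed here by store name.
  private final SortedMap<String, SortedMap<String, byte[]>> memcaches =
      new TreeMap<String, SortedMap<String, byte[]>>();

  void write(String store, String rowAndColumn, byte[] value) {
    // 1. Durability first: record the edit in the HLog.
    hlog.add(store + "/" + rowAndColumn);

    // 2. Then hold the value in the Store's Memcache until it is flushed to HDFS.
    SortedMap<String, byte[]> memcache = memcaches.get(store);
    if (memcache == null) {
      memcache = new TreeMap<String, byte[]>();
      memcaches.put(store, memcache);
    }
    memcache.put(rowAndColumn, value);
  }
}
}}}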
  
  === Read Requests ===
  
