From: Apache Wiki
To: core-commits@hadoop.apache.org
Reply-To: core-dev@hadoop.apache.org
Date: Thu, 02 Apr 2009 15:17:51 -0000
Message-ID: <20090402151751.20781.52354@aurora.apache.org>
Subject: [Hadoop Wiki] Trivial Update of "Hbase/DesignOverview" by EvgenyRyabitskiy

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by EvgenyRyabitskiy:
http://wiki.apache.org/hadoop/Hbase/DesignOverview

------------------------------------------------------------------------------
  '''This page was created on 06.03.09 and now is in progress of construction....'''

- 
  = Table of Contents =

   * [#intro Introduction]
   * [#datamodel Data Model]
   * [#conceptual Conceptual View]
   * [#physical Physical Storage View]
-  * [#arch Architecture and Implementation]
+  * [#regions Regions(Rowranges)]
+  * [#design Architecture Design]
   * [#master HBaseMaster]
   * [#hregionserv HRegionServer]
   * [#client HBase Client]
@@ -27, +27 @@

  Applications store data rows in labeled tables. A data row has a sortable row key and an arbitrary number of columns. The table is stored sparsely, so that rows in the same table can have widely varying numbers of columns.

- HBase is three dimensional sorted map. It maps from Cartesian product of row key, column key and a timestamp to cell value:
+ HBase is three dimensional sorted map. It maps from Cartesian product of row key, column key and timestamp to cell value:

  (row:byte[] x column:byte[] x timestamp:Long) -> byte[]

@@ -82, +82 @@

  However, if no timestamp is supplied, the most recent value for a particular column would be returned and would also be the first one found since timestamps are stored in descending order. Thus a request for the values of all columns in the row "com.cnn.www" if no timestamp is specified would be: the value of ''"contents:"'' from time stamp t6, the value of ''"anchor:cnnsi.com"'' from time stamp t9, the value of ''"anchor:my.look.ca"'' from time stamp t8 and the value of ''"mime:"'' from time stamp t6.
- 
- === Row Ranges: Regions ===
+ [[Anchor(regions)]]
+ === Regions (Row Ranges) ===
  To an application, a table appears to be a list of tuples sorted by row key ascending, column name ascending and timestamp descending. Physically, tables are broken up into row ranges called ''regions''. Each row range contains rows from start-key (inclusive) to end-key (exclusive). A set of regions, sorted appropriately, forms an entire table. A row range is identified by the table name and start-key.

@@ -92, +92 @@

   * !StoreFiles maintain the sparse index in a separate file
   * HBase extends !StoreFiles so that a bloom filter can be employed to enhance negative lookup performance. The hash function employed is one developed by Bob Jenkins.

- [[Anchor(arch)]]
- = Architecture and Implementation =
+ [[Anchor(design)]]
+ = Architecture Design =
  There are three major components of the HBase architecture:

   1. The HMaster (HBase master server)
@@ -105, +105 @@

  [[Anchor(master)]]
  == HMaster ==

- There is one master HMaster per one cluster.
+ There is only one HMaster for a single HBase deployment.

  HMaster duties:
-  * Assigning regions to H!RegionServers
+  * Cluster initialization
+  * Assigning/unassigning regions to/from H!RegionServers (unassigning is for load balancing)
   * Monitor the health of each H!RegionServer
   * Changes to the table schema and handling table administrative functions
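
As an illustration of the data model and region layout described in the changed page text above, here is a minimal in-memory sketch in Python. It is not HBase code: the names `ToyTable`, `get_latest` and `find_region`, the cell values and the region start keys are all hypothetical. It only models a map sorted by row key ascending, column name ascending and timestamp descending, plus row ranges that are start-key inclusive and end-key exclusive.

```python
import bisect


class ToyTable:
    """Toy model of (row, column, timestamp) -> value. Hypothetical, not the HBase API."""

    def __init__(self):
        # Keys are (row, column, -timestamp). Negating the timestamp makes an
        # ascending sort return the newest version of a cell first.
        self._keys = []
        self._values = {}

    def put(self, row, column, timestamp, value):
        key = (row, column, -timestamp)
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get_latest(self, row, column):
        # The 2-tuple (row, column) sorts before every 3-tuple sharing that
        # prefix, so bisect_left lands on the most recent version, if any.
        i = bisect.bisect_left(self._keys, (row, column))
        if i < len(self._keys) and self._keys[i][:2] == (row, column):
            return self._values[self._keys[i]]
        return None


def find_region(start_keys, row):
    """Return the start key of the region holding `row`.

    `start_keys` must be sorted and begin with b"" (the first region).
    Each region covers [start_key, next_start_key).
    """
    return start_keys[bisect.bisect_right(start_keys, row) - 1]


table = ToyTable()
table.put(b"com.cnn.www", b"contents:", 5, b"older page")
table.put(b"com.cnn.www", b"contents:", 6, b"newer page")
latest = table.get_latest(b"com.cnn.www", b"contents:")

regions = [b"", b"g", b"p"]  # hypothetical start keys for three regions
```

With no timestamp supplied, `latest` is the value stored at timestamp 6, matching the "first one found" behaviour the page describes. A row equal to a region's start key belongs to that region, because the start key is inclusive.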