From: Apache Wiki
To: core-commits@hadoop.apache.org
Reply-To: core-dev@hadoop.apache.org
Date: Sun, 08 Mar 2009 17:24:47 -0000
Message-ID: <20090308172447.2518.88630@aurora.apache.org>
Subject: [Hadoop Wiki] Trivial Update of "Hbase/DesignOverview" by EvgenyRyabitskiy

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by EvgenyRyabitskiy:
http://wiki.apache.org/hadoop/Hbase/DesignOverview

------------------------------------------------------------------------------
  An extension was added recently to allow multi-row locking, but this is not the default behavior and must be explicitly enabled.

- More details are here [:Hbase/DataModel: The HBase/Bigtable Data Model]
+ More details are here [:Hbase/DataModel: The HBase Data Model]

  [[Anchor(conceptual)]]
  == Conceptual View ==

+ Conceptually a table may be thought of as a collection of rows that are located by a row key (and optional timestamp) and where any column
+ may not have a value for a particular row key (sparse).
- Conceptually a table may be thought of a collection of rows that
- are located by a row key (and optional timestamp) and where any column
- may not have a value for a particular row key (sparse).
  The following example is a slightly modified form of the one on page 2 of the [http://labs.google.com/papers/bigtable.html Bigtable Paper] (adds a new column family ''"mime:"'').

  [[Anchor(datamodelexample)]]
  ||<:> '''Row Key''' ||<:> '''Time Stamp''' ||<:> '''Column''' ''"contents:"'' ||||<:> '''Column''' ''"anchor:"'' ||<:> '''Column''' ''"mime:"'' ||
@@ -82, +81 @@

  === Row Ranges: Regions ===

- To an application, a table appears to be a list of tuples sorted by row key ascending, column name ascending and timestamp descending. Physically, tables are broken up into row ranges called ''regions'' (equivalent Bigtable term is ''tablet''). Each row range contains rows from start-key (inclusive) to end-key (exclusive). A set of regions, sorted appropriately, forms an entire table. Unlike Bigtable which identifies a row range by the table name and end-key, HBase identifies a row range by the table name and start-key.
+ To an application, a table appears to be a list of tuples sorted by row key ascending, column name ascending and timestamp descending. Physically, tables are broken up into row ranges called ''regions''. Each row range contains rows from start-key (inclusive) to end-key (exclusive). A set of regions, sorted appropriately, forms an entire table. A row range is identified by the table name and start-key.

- Each column family in a region is managed by an ''HStore''. Each HStore may have one or more ''!MapFiles'' (a Hadoop HDFS file type) that is very similar to a Google ''SSTable''. Like SSTables, !MapFiles are immutable once closed. !MapFiles are stored in the Hadoop HDFS. Other details are the same, except:
+ Each column family in a region is managed by a ''Store''. Each ''Store'' may have one or more ''!StoreFiles'' (a Hadoop HDFS file type). !StoreFiles are immutable once closed. !StoreFiles are stored in the Hadoop HDFS. Other details are the same, except:

-  * !MapFiles cannot currently be mapped into memory.
+  * !StoreFiles cannot currently be mapped into memory.

-  * !MapFiles maintain the sparse index in a separate file rather than at the end of the file as SSTable does.
+  * !StoreFiles maintain the sparse index in a separate file rather than at the end of the file as SSTable does.

-  * HBase extends !MapFile so that a bloom filter can be employed to enhance negative lookup performance. The hash function employed is one developed by Bob Jenkins.
+  * HBase extends !StoreFiles so that a bloom filter can be employed to enhance negative lookup performance. The hash function employed is one developed by Bob Jenkins.

  [[Anchor(arch)]]
  = Architecture and Implementation =

  There are three major components of the HBase architecture:

-  1. The H!BaseMaster (analogous to the Bigtable master server)
+  1. The H!BaseMaster (HBase master server)

-  2. The H!RegionServer (analogous to the Bigtable tablet server)
+  2. The H!RegionServer (HBase region server)

   3. The HBase client, defined by org.apache.hadoop.hbase.client.HTable

  Each will be discussed in the following sections.
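
The "Row Ranges: Regions" text above describes two rules: tuples are sorted by row key ascending, column name ascending and timestamp descending, and each region covers rows from start-key (inclusive) to end-key (exclusive) and is identified by its start-key. The following is a minimal Python sketch of those two rules only, not HBase code. The function names, the sample cells, the sample start-keys, and the use of an empty string as the first region's start-key are all assumptions made for illustration.

```python
from bisect import bisect_right

def sort_key(cell):
    """Order a (row_key, column, timestamp) tuple as the page describes:
    row key ascending, column name ascending, timestamp descending."""
    row_key, column, timestamp = cell
    return (row_key, column, -timestamp)

def find_region(start_keys, row_key):
    """Return the start-key of the region whose range [start, end) holds row_key.

    start_keys must be sorted ascending. Assumption for this sketch: the
    first region's start-key is the empty string, so every row key falls
    in some region.
    """
    index = bisect_right(start_keys, row_key) - 1
    return start_keys[index]

# Hypothetical sample data, not taken from the wiki page.
cells = [
    ("com.cnn.www", "contents:", 5),
    ("com.cnn.www", "anchor:cnnsi.com", 9),
    ("com.cnn.www", "contents:", 6),
    ("com.bbc.www", "mime:", 6),
]
ordered = sorted(cells, key=sort_key)

# Three hypothetical regions: ["", "com.cnn"), ["com.cnn", "org"), ["org", end of table).
region_start_keys = ["", "com.cnn", "org"]
```

Because the start-key is inclusive and the end-key exclusive, a row key equal to a region boundary such as `"org"` belongs to the region that starts there, not to the one that ends there; `bisect_right` gives that behaviour directly.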