hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Lucene-hadoop Wiki] Update of "Hbase/HbaseArchitecture" by JimKellerman
Date Wed, 07 Feb 2007 02:40:04 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by JimKellerman:
http://wiki.apache.org/lucene-hadoop/Hbase/HbaseArchitecture

The comment on the change is:
data model. define terminology

------------------------------------------------------------------------------
  
  = Table of Contents =
  
+  * [#datamodel Data Model]
+   * [#columnvaluetypes Column Value Types]
+   * [#conceptual Conceptual View]
   * [#masternode Master Node]
   * [#chubby Distributed Lock Server]
   * [#tabletserver Tablet Server]
@@ -11, +14 @@

   * [#metadata METADATA Table]
   * [#clientlib Client Library]
   * [#schema Configuration / Schema Definition]
-   * [#conceptual Conceptual Storage View]
    * [#physical Physical Storage View]
   * [#api API]
   * [#other Other]
   * [#comments Comments]
+ 
+ [[Anchor(datamodel)]]
+ = Data Model =
+ 
+ An Hbase table is a sparse, distributed, persistent, multi-dimensional
+ sorted map. The map is indexed by a row key, a column key, and a
+ timestamp. Each value in the map is an uninterpreted array of bytes.
+ 
+ (row:string, column:string, time:long) -> byte[]
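
The signature above can be illustrated with a small in-memory sketch. This is not Hbase code: the class name `SparseSortedTable`, its methods, and the "newest value at or before the requested timestamp" read rule are all assumptions made for illustration. It only shows how a sorted map keyed by row, column, and timestamp can hold uninterpreted bytes.

```python
import bisect


class SparseSortedTable:
    """Toy model of (row, column, time) -> bytes. Hypothetical, not Hbase."""

    def __init__(self):
        # Keys are kept sorted by row, then column, then newest timestamp first
        # (the timestamp is negated so that larger timestamps sort earlier).
        self._keys = []
        self._cells = {}

    def put(self, row, column, timestamp, value):
        key = (row, column, -timestamp)
        if key not in self._cells:
            bisect.insort(self._keys, key)
        self._cells[key] = bytes(value)  # values are uninterpreted bytes

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before `timestamp` (assumed rule).

        With no timestamp, the newest stored version is returned.
        Absent cells simply return None, which is what "sparse" means here.
        """
        bound = float("-inf") if timestamp is None else -timestamp
        i = bisect.bisect_left(self._keys, (row, column, bound))
        if i < len(self._keys) and self._keys[i][:2] == (row, column):
            return self._cells[self._keys[i]]
        return None


table = SparseSortedTable()
table.put("com.cnn.www", "contents", 3, b"<html>v3")
table.put("com.cnn.www", "contents", 5, b"<html>v5")
print(table.get("com.cnn.www", "contents"))     # newest version
print(table.get("com.cnn.www", "contents", 4))  # version as of time 4
```

Keeping the keys in one sorted sequence means all versions of a cell, and all columns of a row, sit next to each other, so a lookup is a single binary search.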
+ 
+ [[Anchor(columnvaluetypes)]]
+ == Column Value Types ==
+ 
+ A column may have a single value for a specified row key or it may
+ have a map of key-value pairs. The former is called a ''value column''
+ or '''column''' for short, the latter is called a ''map column'' or
+ '''map''' for short.
+ 
+ Google makes no distinction between these two value types and groups
+ them under the term ''column family''. In Bigtable, a single-valued
+ column is a degenerate case of a column family: one that has no
+ column key.
+ 
+ In the general case, Google allows arbitrary keys in a column
+ family. However, they also provide a specialization called a
+ ''locality group'' in which the column keys are limited to a specific
+ enumerated set. In the example given on page 6 of the
+ [http://labs.google.com/papers/bigtable.html Bigtable Paper], they
+ define a locality group that contains web page metadata and has
+ specific keys for language and checksums.
+ 
+ We feel that this is an unnecessary complication of the platform, and
+ will support '''columns''' and '''maps''' only. Should a client
+ application desire to implement a ''locality group'' it can do so by
+ simply restricting its map column key set.
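
As a rough sketch of that last point, a client could wrap a map column and reject keys outside a fixed set. The `MapColumn` class and its `allowed_keys` parameter are hypothetical names used only for this example, and the key names "language" and "checksum" are taken loosely from the metadata example mentioned above. None of this is a proposed Hbase API.

```python
class MapColumn:
    """Toy map column: map key -> bytes. Hypothetical, for illustration only."""

    def __init__(self, allowed_keys=None):
        # allowed_keys=None means arbitrary keys are accepted (a plain map).
        # A fixed set emulates the restricted key set described in the text.
        self._allowed = None if allowed_keys is None else frozenset(allowed_keys)
        self._entries = {}

    def put(self, key, value):
        if self._allowed is not None and key not in self._allowed:
            raise KeyError("key %r is not in the restricted key set" % (key,))
        self._entries[key] = bytes(value)

    def get(self, key):
        # Missing keys return None rather than raising.
        return self._entries.get(key)


# Unrestricted map column: any key is accepted.
anchor = MapColumn()
anchor.put("cnnsi.com", b"CNN")

# Restricted map column: the client enforces the key set itself.
metadata = MapColumn(allowed_keys={"language", "checksum"})
metadata.put("language", b"EN")
```

The restriction lives entirely in client code here, so the platform only needs to know about columns and maps.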
+ 
+ [[Anchor(conceptual)]]
+ == Conceptual View ==
+ 
+ Conceptually, a table may be thought of as a collection of rows that
+ are located by a row key (and an optional timestamp). Any column may
+ lack a value for a particular row key, which is what makes the table
+ sparse. The following example is a slightly modified form of the one
+ on page 2 of the [http://labs.google.com/papers/bigtable.html Bigtable Paper].
+ 
+ ||<:|2> '''Row Key''' ||<:|2> '''Time Stamp''' ||<:|2> '''Column''' ''"contents"'' |||| '''Map''' ''"anchor"'' ||<:|2> '''Column''' ''"mime"'' ||
+ ||<:> '''key''' ||<:> '''value''' ||
+ ||<^|5> "com.cnn.www" ||<:> t9 || ||<)> "cnnsi.com" ||<:> "CNN" || ||
+ ||<:> t8 || ||<)> "my.look.ca" ||<:> "CNN.com" || ||
+ ||<:> t6 ||<:> `"<html>..."` || || ||<:> "text/html" ||
+ ||<:> t5 ||<:> `"<html>..."` || || || ||
+ ||<:> t3 ||<:> `"<html>..."` || || || ||
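
The table above can also be written out as plain data to show how a read by row key and optional timestamp might behave. This is only a sketch: the integer timestamps 3 to 9 stand in for t3 to t9, the function `read_row` is a hypothetical name, and the rule "take the newest value at or before the requested time" is an assumption rather than something specified on this page.

```python
# One tuple per non-empty cell: (row key, timestamp, column, map key, value).
# map key is None for a value column. Empty cells are simply not stored.
CELLS = [
    ("com.cnn.www", 9, "anchor", "cnnsi.com", b"CNN"),
    ("com.cnn.www", 8, "anchor", "my.look.ca", b"CNN.com"),
    ("com.cnn.www", 6, "contents", None, b"<html>..."),
    ("com.cnn.www", 6, "mime", None, b"text/html"),
    ("com.cnn.www", 5, "contents", None, b"<html>..."),
    ("com.cnn.www", 3, "contents", None, b"<html>..."),
]


def read_row(cells, row, as_of=None):
    """Return the newest value per column (and per map key) for one row.

    If as_of is given, cells newer than that timestamp are ignored.
    """
    newest = {}  # (column, map key) -> timestamp of the value kept so far
    result = {}
    for r, ts, column, map_key, value in cells:
        if r != row or (as_of is not None and ts > as_of):
            continue
        slot = (column, map_key)
        if slot in newest and newest[slot] >= ts:
            continue
        newest[slot] = ts
        if map_key is None:
            result[column] = value  # value column
        else:
            result.setdefault(column, {})[map_key] = value  # map column
    return result


print(read_row(CELLS, "com.cnn.www"))
print(read_row(CELLS, "com.cnn.www", as_of=5))
```

Reading as of time 5 yields only the "contents" column, since the "anchor" and "mime" cells were written later.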
  
  [[Anchor(masternode)]]
  = Master Node =
@@ -209, +261 @@

  [[Anchor(schema)]]
  = Configuration / Schema Definition =
  
- [[Anchor(conceptual)]]
- == Conceptual Storage View ==
- 
- Conceptually a table may be thought of a collection of rows that
- are located by a row key (and optional timestamp) and where any column
- may not have a value for a particular row key (sparse). The following example is a slightly modified form of the one on page 2 of the [http://labs.google.com/papers/bigtable.html Bigtable Paper].
- 
- ||<:|2> '''Row Key''' ||<:|2> '''Time Stamp''' ||<:|2> '''Column''' ''"contents:"'' |||| '''Family''' ''"anchor:"'' ||<:|2> '''Column''' ''"mime:"'' ||
- ||<:> '''key''' ||<:> '''value''' ||
- ||<^|5> "com.cnn.www" ||<:> t9 || ||<)> "cnnsi.com" ||<:> "CNN" || ||
- ||<:> t8 || ||<)> "my.look.ca" ||<:> "CNN.com" || ||
- ||<:> t6 ||<:> "<html>..." || || ||<:> "text/html" ||
- ||<:> t5 ||<:> `"<html>..."` || || || ||
- ||<:> t3 ||<:> `"<html>..."` || || || ||
- 
  [[Anchor(physical)]]
  == Physical Storage View ==
  
