hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hbase" by JimKellerman
Date Fri, 08 Feb 2008 21:04:59 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by JimKellerman:
http://wiki.apache.org/hadoop/Hbase

The comment on the change is:
Restructuring Wiki

------------------------------------------------------------------------------
  #pragma section-numbers off
  attachment:hbase_logo_med.gif
  
- = Bigtable-like structured storage for Hadoop HDFS =
+ = HBase: Bigtable-like structured storage for Hadoop HDFS =
- 
- [[Anchor(links)]]
-  * HBase source control: https://svn.apache.org/repos/asf/hadoop/hbase/trunk
-  * [#news News]
-  * [#background Background]
-  * [wiki:Hbase/HbaseArchitecture  Hbase Architecture]
-  * [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/javadoc/org/apache/hadoop/hbase/package-summary.html#package_description Getting Started] description hosted inside the HBase javadoc package description or see how to checkout, build and run hbase in about [wiki:Hbase/10Minutes 10 Minutes].
-  * [wiki:Hbase/FAQ FAQ]
-  * [wiki:Hbase/UsingBloomFilters Using Bloom Filters]
-  * [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/javadoc/org/apache/hadoop/hbase/package-summary.html HBase API Docs] built as part of Hadoop nightlies
-  * Hbase and Performance
-   * [wiki:Hbase/PerformanceEvaluation Tools for evaluating HBase performance and scalability]
-    * There are setup instructions and a JMeter Test Plan in [https://issues.apache.org/jira/browse/HADOOP-2625 HADOOP-2625]
-   * [:Hbase/HbaseRTDS]: Discuss the evaluation of Hbase
-  * HBase discussion happens up on the HBase mailing lists:
-   * User information: [[MailTo(hbase-user AT SPAMFREE hadoop DOT apache DOT org)]]
-   * Developer information: [[MailTo(hbase-dev AT SPAMFREE hadoop DOT apache DOT org)]]
-   * See also the hadoop mailing lists [http://hadoop.apache.org/core/mailing_lists.html]
-  * Hbase IRC channel is #hbase at irc.freenode.net.
-  * [:Hbase/HbaseRest] HBase REST-gateway spec.
-  * [:Hbase/ThriftApi] HBase Thrift gateway discussion and spec.
-  * [:HBase/HBasePresentations] HBase presentations
-  * [:Hbase/MapReduce] Using HBase !MapReducing
-  * [:Hbase/IssuePriorityGuidelines] How to rate the priority of your issues in JIRA.
-  * [:Hbase/HbaseShell:Hbase Shell], a Query Language Shell for Hadoop + Hbase 
-  * [:Hbase/Jython] Accessing HBase from Jython
-  * [:Hbase/PoweredBy: PoweredBy], a list of sites and applications powered by Hbase
-  * Planning:
-   * [:Hbase/Plan-0.17: Plan for Hbase 0.17]
- 
- [[Anchor(news)]]
- == NEWS: ==
-  * HBase moves to new SVN and JIRA -- ''2008/02/04''
-  * First [http://www.eventbrite.com/event/85834734 Hbase meetup].  Hosted by rapleaf -- ''2007/12/18''
-  * Paul Saab uploads 1.3B (small) two-family rows into a 24 node hbase cluster -- ''2007/12/15''
-  * Extensive refactoring of locking and addition of first version of a REST interface -- ''2007/11/25''
-  * First working release of hbase is available as part of the hadoop-0.15.0 release.  See [http://svn.apache.org/viewvc/lucene/hadoop/branches/branch-0.15/src/contrib/hbase/CHANGES.txt?view=markup CHANGES.txt] for release content.  [http://aa0-000-12.u.powerset.com:60010/hql.jsp?q=select+anchor%3Aanchor_text+from+enwiki%3B Download].
-  * Cluster behavior has been much improved. The master, rather than the splitting region server host, now rules where the daughter splits are deployed. A simple formula has been added to spread region load evenly.  Splits have been made near-instantaneous and compaction has been reworked so neither block updates for extended periods of time. -- ''Added 2007/08/16''
-  * Support for row and filter columns.
-  * A simple [wiki:Hbase/HbaseShell shell] for manipulating HBase tables contributed by Edward Yoon. -- ''Added 2007/07/10'''
-  * Map/Reduce connector for HBase - contributed by Vuk Ercegovac -- ''Added 2007/06/30''
-  * Scripts to start and stop a hbase cluster have been added.  See ${HBASE_HOME}/bin. List cluster participants in ${HBASE_HOME}/conf/regionservers file). -- ''Added 2007/06/21''
-  * A script to run distributed clients executing the Performance Evaluation tests described in the Google Bigtable paper has been added and tested to completion running against a small cluster of 4 region servers. See [wiki:Hbase/PerformanceEvaluation Tools for evaluating HBase performance and scalability] -- ''Added 2007/06/21''
-  * It is now possible to add or delete column families after a table exists. Before either of these operations the table being updated must be taken off-line (disabled) -- ''Added 2007/05/30''
-  * Data compression is available on a per-column family basis. -- ''Added 2007/05/30'' The options are:
-   * no compression
-   * record level compression
-   * block level compression
-  * HBase now has its own component in the [https://issues.apache.org/jira/browse/HADOOP Hadoop Jira]. Bug reports, contributions, etc. should be tagged with the component '''contrib/hbase'''.
-  * HBase is being updated frequently. The latest code can always be found in the [http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/contrib/hbase/ trunk of the Hadoop svn tree].
- 
- See the [https://issues.apache.org/jira/browse/HBASE HBase section of JIRA] for current set of outstanding issues and recent fixes.
- 
- 
- [[Anchor(background)]]
- == Background ==
  
  Google's [http://labs.google.com/papers/bigtable.html Bigtable],
  a distributed storage system for structured data, is a very effective 
  mechanism for storing very large amounts of data in a distributed
  environment. Just as Bigtable leverages the distributed data storage provided
  by the [http://labs.google.com/papers/gfs.html Google File System],
- Hbase will provide Bigtable-like capabilities on top of Hadoop. 
+ HBase provides Bigtable-like capabilities on top of Hadoop. 
- Data is organized into tables, rows and columns. An Iterator-like interface is available
+ Data is organized into tables, rows and columns. An Iterator-like interface
- for scanning through a row range (and of course there is an ability to
+ is available for scanning through a row range (and of course there is the
- retrieve a column value for a specific key).
+ ability to retrieve a column value for a specific key).
  Any particular column may have multiple values for the same row key.
  A secondary key can be provided to select a particular value or an
  Iterator can be set up to scan through the key-value pairs for that column 
+ given a specific row key.
- given a specific row key. See [wiki:Hbase/HbaseArchitecture  Hbase Architecture]
- to learn more about Hbase.
  
- [[Anchor(rationale)]]
- === Rationale ===
+ == General Information ==
+  * [wiki:Hbase/HbaseArchitecture HBase Architecture]
+  * [wiki:Hbase/FAQ FAQ]
+  * Support:
+   * HBase IRC channel #hbase at irc.freenode.net.
+   * HBase mailing lists:
+    * User information: [[MailTo(hbase-user AT SPAMFREE hadoop DOT apache DOT org)]]
+    * Developer information: [[MailTo(hbase-dev AT SPAMFREE hadoop DOT apache DOT org)]]
+    * See also the hadoop mailing lists [http://hadoop.apache.org/core/mailing_lists.html]
+  * HBase [:HBase/News: news] and [:HBase/HBasePresentations: presentations]
+  * [:Hbase/PoweredBy: PoweredBy], a list of sites and applications powered by HBase
  
- Both Google's GFS and Hadoop's HDFS provide a mechanism to
- reliably store large amounts of data. However, there is not really a 
- mechanism for organizing the data and accessing only the parts that
- are of interest to a particular application.
+ == User Documentation ==
+  * [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/javadoc/org/apache/hadoop/hbase/package-summary.html#package_description Getting Started]
+  * [wiki:Hbase/10Minutes How to checkout, build and run hbase in about 10 Minutes].
+  * [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/javadoc/org/apache/hadoop/hbase/package-summary.html HBase API Docs]
+  * [:Hbase/HbaseShell:HBase Shell], a Query Language Shell for Hadoop + HBase 
+  * [:Hbase/Jython: Jython interface to HBase]
+  * [:Hbase/HbaseRest: REST gateway specification for HBase]
+  * [:Hbase/ThriftApi: Thrift gateway specification for HBase]
+  * [:Hbase/MapReduce: Using HBase with Hadoop !MapReduce]
+  * [wiki:Hbase/UsingBloomFilters Using Bloom Filters]
+  * HBase and Performance
+   * [wiki:Hbase/PerformanceEvaluation: Tools for evaluating HBase performance and scalability]
+    * There are setup instructions and a JMeter Test Plan in [https://issues.apache.org/jira/browse/HADOOP-2625 HADOOP-2625]
+   * [:Hbase/HbaseRTDS: A performance evaluation of HBase]
  
- Bigtable (and Hbase) provide a means for organizing and efficiently
- accessing these large data sets.
+ == Developer Documentation ==
+  * Roadmaps
+   * [:Hbase/Plan-0.17: Roadmap for HBase 0.2]
+  * [:Hbase/HowToContribute How to contribute]
+  * [:Hbase/HowToCommit How to commit]
+  * [:Hbase/IssuePriorityGuidelines How to rate the priority of your issues in JIRA]
  
- [[Anchor(goals)]]
  === Goals ===
  
  Design (and subsequently implement) a structured storage system as
  similar to Google's Bigtable as possible for the Hadoop environment.
  
- [[Anchor(nongoals)]]
  ==== Non-Goals ====
  
   * Gratuitous changes that are essentially "re-inventing the wheel" or are the result of "not invented here".
+  * For the near term features outside those outlined by the [http://labs.google.com/papers/bigtable.html Bigtable paper]
   * Premature optimization. Once there is a working version, the system will be profiled for hot spots.
  
- 
- [[Anchor(contributors)]]
- == Initial Contributors ==
- 
-   * Mike Cafarella (who wrote the initial code base)
-   * JimKellerman [[MailTo(jim AT SPAMFREE powerset DOT com)]]
-   * Michael Stack [[MailTo(stack AT SPAMFREE powerset DOT com)]]
- 
- [[Anchor(comments)]]
- == Comments ==
- 
- Please add comments related to the project goals and process below.
- Architectural comments should be posted on same page as the portion of
- the architecture to which the comment is directed. Thank you.
- 
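
The data model described in the Background text of the page above (tables, rows and columns; an Iterator-like scan over a row range; several values per column for one row key, selected by a secondary key) can be illustrated with a small in-memory sketch. This is not the HBase API: `ToyTable` and its method names are hypothetical, and two behaviors are assumptions made only for illustration, namely that secondary keys are orderable values and that a lookup without a secondary key returns the value stored under the largest one.

```python
from bisect import bisect_left, insort


class ToyTable:
    """Illustrative in-memory model only; hypothetical, not the HBase API."""

    def __init__(self):
        # row key -> {column -> {secondary key -> value}}
        self._rows = {}
        # Row keys kept in sorted order so that row ranges can be scanned.
        self._row_keys = []

    def put(self, row, column, secondary_key, value):
        """Store one value for (row, column) under the given secondary key."""
        if row not in self._rows:
            self._rows[row] = {}
            insort(self._row_keys, row)
        self._rows[row].setdefault(column, {})[secondary_key] = value

    def get(self, row, column, secondary_key=None):
        """Return one column value for a specific row key.

        Assumption: with no secondary key, pick the largest one stored.
        """
        values = self._rows[row][column]
        if secondary_key is None:
            secondary_key = max(values)
        return values[secondary_key]

    def scan(self, start_row, stop_row):
        """Iterate over (row, columns) with start_row <= row < stop_row."""
        lo = bisect_left(self._row_keys, start_row)
        hi = bisect_left(self._row_keys, stop_row)
        for row in self._row_keys[lo:hi]:
            yield row, self._rows[row]

    def values(self, row, column):
        """Iterate over (secondary key, value) pairs of one column in one row."""
        yield from sorted(self._rows[row][column].items())
```

For example, after `put("row1", "anchor:text", 1, "a")` and `put("row1", "anchor:text", 2, "b")`, calling `get("row1", "anchor:text", 1)` selects the value stored under secondary key 1, while `values("row1", "anchor:text")` walks both key-value pairs for that column.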
