hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Lucene-hadoop Wiki] Trivial Update of "Hbase/HbaseArchitecture" by JimKellerman
Date Tue, 06 Feb 2007 18:49:42 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by JimKellerman:
http://wiki.apache.org/lucene-hadoop/Hbase/HbaseArchitecture

------------------------------------------------------------------------------
  I think Hbase should be compact (space-efficient), fast and should be able to manage high-demand
load. It should be able to handle sparse tables efficiently.
  So, for wide and sparse data, Hbase must store data by columns like C-Store does.
  
-   ''I agree. But let's not get ahead of ourselves here. I only posted the conceptual view
last night. There is no part of the document that discusses how the data is physically organized.
I was going to work on that today. Patience.'' -- JimKellerman
+  ''I agree. But let's not get ahead of ourselves here. I only posted the conceptual view
last night. There is no part of the document that discusses how the data is physically organized.
I was going to work on that today. Patience.'' -- JimKellerman
  
  A column-oriented system handles NULLs more easily with significantly smaller performance
overhead,
  and supports both Horizontal and Vertical Parallel Processing.
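A minimal sketch of why storing by column suits sparse tables, assuming one sorted map per column. The class `SparseColumnStore` and its methods are hypothetical names for illustration and are not part of any Hbase design; the point is only that a cell that was never written takes no space, so NULLs need no special handling.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

/** Hypothetical sketch: one sorted map per column; absent cells are simply not stored. */
final class SparseColumnStore {
    // column name -> (row key -> value); only cells that were written appear here
    private final Map<String, TreeMap<String, byte[]>> columns = new TreeMap<>();

    /** Stores a value for (row, column), creating the column map on first use. */
    void put(String row, String column, byte[] value) {
        columns.computeIfAbsent(column, c -> new TreeMap<>()).put(row, value);
    }

    /** Returns the value for (row, column), or empty if the cell was never written. */
    Optional<byte[]> get(String row, String column) {
        TreeMap<String, byte[]> cells = columns.get(column);
        return cells == null ? Optional.empty() : Optional.ofNullable(cells.get(row));
    }

    /** Counts the cells actually held in memory. */
    int storedCells() {
        int n = 0;
        for (TreeMap<String, byte[]> cells : columns.values()) {
            n += cells.size();
        }
        return n;
    }
}
```

With two rows that each populate a different column, `storedCells()` reports two cells rather than the four a dense row layout would reserve.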
@@ -289, +289 @@

   * Columns are in the form of (family: optional qualifier). This is a RDF Properties 
   * Columns have type information  
  
-  ''In both Bigtable and Hbase, there is no notion of type. Keys and values in Bigtable
are arbitrary strings. For Hbase, we are considering making values arbitrary byte arrays.''
+   ''In both Bigtable and Hbase, there is no notion of type. Keys and values in Bigtable
are arbitrary strings. For Hbase, we are considering making values arbitrary byte arrays.''
  
-  ''Why? Bigtable is written in C++ and std::string can contain an arbitrary byte sequence.
Hbase will be written in Java and in Java Strings have an encoding associated with them. Unless
you store the original encoding of a value, you have no way to decode it back into the same
encoding.'' -- JimKellerman
+   ''Why? Bigtable is written in C++ and std::string can contain an arbitrary byte sequence.
Hbase will be written in Java and in Java Strings have an encoding associated with them. Unless
you store the original encoding of a value, you have no way to decode it back into the same
encoding.'' -- JimKellerman
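The encoding concern above can be illustrated with a small sketch. It assumes UTF-8 as the charset (the message does not name one), and `ByteRoundTrip` is a hypothetical helper, not Hbase code. It shows that an arbitrary byte sequence pushed through a Java String does not necessarily come back unchanged, which is one argument for keeping values as byte arrays.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper: checks whether bytes survive a trip through a Java String. */
final class ByteRoundTrip {
    /** Decodes the bytes as UTF-8 text and encodes them again. Malformed input is replaced, not preserved. */
    static byte[] viaString(byte[] raw) {
        String s = new String(raw, StandardCharsets.UTF_8);
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** True only if the round trip returns exactly the original bytes. */
    static boolean survives(byte[] raw) {
        return Arrays.equals(raw, viaString(raw));
    }
}
```

Plain ASCII text survives the trip, while a sequence such as `{(byte) 0xFF, (byte) 0xFE}` is not valid UTF-8 and comes back as different bytes.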
  
   * Because of the design of the system, columns are easy to create (and are created implicitly)

  
-  ''In Bigtable, columns are easy to create but they require administration privileges (Access
Control Lists control who can manipulate the schema). Hbase will follow this metaphor.'' --
JimKellerman
+   ''In Bigtable, columns are easy to create but they require administration privileges (Access
Control Lists control who can manipulate the schema). Hbase will follow this metaphor.'' --
JimKellerman
  
   * Column families can be split into locality groups (Ontologies!) 
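
The (family: optional qualifier) column form in the list above could be parsed roughly as follows. This is a sketch under stated assumptions: the name is split at the first colon and a missing qualifier is treated as the empty string. `ColumnName` is a hypothetical class, and the page does not specify these rules.

```java
/** Hypothetical value type for a column name of the form family:qualifier. */
final class ColumnName {
    final String family;
    final String qualifier; // empty string when the qualifier is omitted (assumption)

    private ColumnName(String family, String qualifier) {
        this.family = family;
        this.qualifier = qualifier;
    }

    /** Splits at the first ':'; everything after it, including further colons, is the qualifier. */
    static ColumnName parse(String name) {
        int i = name.indexOf(':');
        if (i < 0) {
            return new ColumnName(name, "");
        }
        return new ColumnName(name.substring(0, i), name.substring(i + 1));
    }
}
```

For example, `ColumnName.parse("anchor:cnnsi.com")` yields family `anchor` and qualifier `cnnsi.com`, while `ColumnName.parse("contents:")` yields family `contents` with an empty qualifier.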
  
