Mailing-List: contact core-commits-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: core-dev@hadoop.apache.org
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: Apache Wiki <wikidiffs@apache.org>
To: core-commits@hadoop.apache.org
Date: Fri, 03 Oct 2008 19:58:22 -0000
Message-ID: <20081003195822.7804.42882@eos.apache.org>
Subject: [Hadoop Wiki] Trivial Update of "Hbase/NewFileFormat" by stack

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by stack:
http://wiki.apache.org/hadoop/Hbase/NewFileFormat

------------------------------------------------------------------------------
- This page is for discussion related to [https://issues.apache.org/jira/browse/HBASE-61 HBASE-61, Create an HBase-specific MapFile implementation].  That issue, and its linked issues, has a bunch of suggestions for how we might do a better persistence.  Most have been replicated in the ''New Format'' section below.  Other related issues include, [https://issues.apache.org/jira/browse/HADOOP-3315 TFile], and [https://issues.apache.org/jira/browse/HBASE-647 HBASE-647, Remove the HStoreFile 'info' file (and index and bloomfilter if possible)].
+ This page is for discussion related to [https://issues.apache.org/jira/browse/HBASE-61 HBASE-61, Create an HBase-specific MapFile implementation].  That issue, and its linked issues, has a bunch of suggestions for how we might do a better persistence.  Most have been replicated in the ''New Format'' section below.  Other related issues include, [https://issues.apache.org/jira/browse/HADOOP-3315 TFile], and [https://issues.apache.org/jira/browse/HBASE-647 HBASE-647, Remove the HStoreFile 'info' file (and index and bloomfilter if possible)] as well as ''SSTable'' from the Bigtable paper.
  
  == Current Implementation ==
  
@@ -41, +41 @@

   * Always-on general bloomfilter. We know how many entries a file will have when we go to flush it, so we can optimally size a bloomfilter.  The small amount of memory a bloomfilter occupies will pay for itself many-fold in the seeks saved trying to figure out whether a file contains an asked-for key.
   * Optimal random-access
   * Iterate over keys only, rather than over key+values always as MapFile currently does.  This'd be useful when trying to find the closest key. TFile and SequenceFile can do this (it's not exposed in MapFile).
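
As a rough illustration of sizing a bloomfilter from a known entry count, the sketch below applies the standard formulas (bits m = -n ln p / (ln 2)^2, hash count k = (m/n) ln 2).  The function name and the target false-positive rate are assumptions made for this example; nothing here is taken from the HBase code.

```python
import math


def bloom_parameters(num_entries, false_positive_rate):
    """Return (bits, hash_count) for a bloomfilter sized for num_entries.

    Hypothetical helper: uses the textbook sizing formulas, not HBase code.
    """
    if num_entries <= 0:
        raise ValueError("num_entries must be positive")
    if not 0.0 < false_positive_rate < 1.0:
        raise ValueError("false_positive_rate must be between 0 and 1")
    # m = -n * ln(p) / (ln 2)^2
    bits = math.ceil(
        -num_entries * math.log(false_positive_rate) / (math.log(2) ** 2)
    )
    # k = (m / n) * ln 2, rounded to a whole number of hash functions
    hash_count = max(1, round(bits / num_entries * math.log(2)))
    return bits, hash_count


if __name__ == "__main__":
    # The entry count is known at flush time; the 1% rate is an assumed target.
    bits, hash_count = bloom_parameters(1_000_000, 0.01)
    print(bits // 8, "bytes,", hash_count, "hash functions")
```

At an assumed 1% false-positive rate this comes to a little under 10 bits per entry, which is the "small amount of memory" traded against the seeks saved.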
-  
+ 
+ === Index ===
+ TODO, but TFile's block-based index, rather than MapFile's interval-based one, would seem better for us: indices are then of predictable size, and a seek to an index position will land at an amenable spot when blocks are compressed.
  
  === Nice-to-haves ===
   * Don't write out the family portion of column when writing keys.
  
  == Other File Formats ==
+ 
  Cassandra uses a Sequence File.  It adds key/values in blocks of 128 by default.  On the 128th entry, an index for the block keys is inlined and then a new block begins.  Block offsets are kept out in an index file as in MapFile.  Bloomfilters are on by default.
  
+ From the Bigtable paper, an SSTable "... contains a sequence of blocks (typically each block is 64KB in size, but this is configurable).  A block index (stored at the end of the SSTable) is used to locate blocks; the index is loaded into memory when the SSTable is opened.  A lookup can be performed with a single disk seek: we first find the appropriate block by performing a binary search in the in-memory index, and then reading the appropriate block from disk.  Optionally, an SSTable can be completely mapped into memory, which allows us to perform lookups and scans without touching the disk."
+
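
The lookup described in the SSTable quote can be sketched roughly as follows.  This is a toy in-memory model under stated assumptions: the class and method names are hypothetical, blocks are cut by entry count rather than by byte size (the paper uses roughly 64KB blocks), and the "disk" is just a Python list with a read counter.

```python
import bisect


class BlockIndexedFile:
    """Toy model of a block-indexed sorted file (hypothetical, not HBase code)."""

    def __init__(self, sorted_items, entries_per_block=4):
        # Assumption: split by entry count; a real file would split by bytes.
        self._blocks = [
            sorted_items[i:i + entries_per_block]
            for i in range(0, len(sorted_items), entries_per_block)
        ]
        # In-memory block index: the first key of every block.
        self._first_keys = [block[0][0] for block in self._blocks]
        self.block_reads = 0

    def _read_block(self, block_number):
        # Stands in for the single disk seek and block read.
        self.block_reads += 1
        return self._blocks[block_number]

    def get(self, key):
        # Binary search the index for the last block whose first key <= key.
        block_number = bisect.bisect_right(self._first_keys, key) - 1
        if block_number < 0:
            return None  # key sorts before the first block; no read needed
        block = self._read_block(block_number)
        keys = [k for k, _ in block]
        position = bisect.bisect_left(keys, key)
        if position < len(keys) and keys[position] == key:
            return block[position][1]
        return None


if __name__ == "__main__":
    items = [("row%03d" % i, i) for i in range(10)]
    store = BlockIndexedFile(items)
    print(store.get("row005"), store.block_reads)
```

Because the index holds one entry per block, its size depends on the number of blocks rather than on the number of keys, which is the predictability the Index section above is after.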