hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Trivial Update of "Hbase/NewFileFormat" by stack
Date Fri, 23 Jan 2009 23:02:21 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by stack:
http://wiki.apache.org/hadoop/Hbase/NewFileFormat

------------------------------------------------------------------------------
   * Smart getClosest and getClosestAtOrBefore [https://issues.apache.org/jira/browse/HBASE-792 hbase-792]
   * Get vs. Scan accesses.  The latter carries state.
   * Sharing blocks and indices: A single file can have multiple Readers (e.g. many concurrent Scanners).  Rather than read in the index for each instance, share the index if one is already in memory.  Same for file blocks: only make the trip to the datanode if an instance of the (read-only) block is not already in memory.
+  * Version: File format should have a version so we can evolve the format.
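
To make the block-sharing requirement above concrete, here is a minimal Java sketch, not the actual HBase design. The names `SharedBlockCache`, `BlockLoader` and `getBlock` are hypothetical, and the loader merely stands in for a read from the datanode. All Readers on a file go through one cache, so a block is fetched at most once and handed out as a read-only view.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class SharedBlockCache {
    /** Hypothetical loader standing in for a block read from the datanode. */
    public interface BlockLoader {
        byte[] load(String file, long offset);
    }

    // One entry per (file, block offset); values are read-only buffers.
    private final ConcurrentMap<String, ByteBuffer> blocks = new ConcurrentHashMap<>();
    private final BlockLoader loader;

    public SharedBlockCache(BlockLoader loader) {
        this.loader = loader;
    }

    /** Returns a read-only view of the block, loading it only if it is not cached yet. */
    public ByteBuffer getBlock(String file, long offset) {
        String key = file + "@" + offset;
        ByteBuffer cached = blocks.computeIfAbsent(
            key, k -> ByteBuffer.wrap(loader.load(file, offset)).asReadOnlyBuffer());
        // duplicate() shares the bytes but gives each caller its own position and limit.
        return cached.duplicate();
    }
}
```

The same pattern would apply to indices: key the map on the file alone and cache the parsed index object. Eviction is left out of this sketch; a real cache would need a size bound.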
  
  === Index ===
  TODO, but the TFile block-based index rather than the !MapFile interval-based index would seem better for us: indices are then of predictable size, and a seek to an index position lands at a usable spot when blocks are compressed.
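
As a rough illustration of a block-based index (a sketch under assumptions, not TFile's actual layout), the Java below keeps one entry per data block: the block's first key and its file offset. `BlockIndex` and `blockOffsetFor` are hypothetical names, and keys are plain Strings for brevity where real keys would be byte arrays with a custom comparator. The index size depends only on the number of blocks, and every lookup resolves to a block start.

```java
import java.util.Arrays;

public class BlockIndex {
    private final String[] firstKeys; // first key of each block, sorted ascending
    private final long[] offsets;     // file offset at which each block starts

    public BlockIndex(String[] firstKeys, long[] offsets) {
        if (firstKeys.length != offsets.length) {
            throw new IllegalArgumentException("one offset is required per first key");
        }
        this.firstKeys = firstKeys.clone();
        this.offsets = offsets.clone();
    }

    /**
     * Returns the offset of the only block that could contain the key,
     * or -1 if the key sorts before the first block.
     */
    public long blockOffsetFor(String key) {
        int pos = Arrays.binarySearch(firstKeys, key);
        if (pos >= 0) {
            return offsets[pos]; // key is exactly the first key of a block
        }
        int insertion = -pos - 1; // index of the first block whose first key is greater
        return insertion == 0 ? -1L : offsets[insertion - 1];
    }
}
```

With first keys `apple`, `mango`, `pear`, a lookup for `banana` resolves to the block starting at `apple`, which is then read and scanned from its start.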
  
  === Nice-to-haves ===
   * Don't write out the family portion of column when writing keys [https://issues.apache.org/jira/browse/HBASE-68 HBASE-68]
+  * Row index at end of datablock.  Doesn't have to hold the actual rows, just their positions.  Can look at the current position and then at the data block's index to figure the next row start.
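
One way the row-index idea could look, purely as an assumption-laden Java sketch: the block payload is followed by the int32 start position of each row and then an int32 count. The layout and the names `RowPositionTrailer`, `write` and `nextRowStart` are hypothetical. Given the current position, a binary search over the trailer yields the start of the next row without decoding any keys.

```java
import java.nio.ByteBuffer;

public class RowPositionTrailer {
    /** Builds a block: payload, then each row-start position, then the row count. */
    public static ByteBuffer write(byte[] payload, int[] rowStarts) {
        ByteBuffer buf = ByteBuffer.allocate(payload.length + 4 * rowStarts.length + 4);
        buf.put(payload);
        for (int start : rowStarts) {
            buf.putInt(start);
        }
        buf.putInt(rowStarts.length);
        buf.flip(); // position 0, limit at the end of the block
        return buf;
    }

    /** Returns the start of the first row strictly after currentPos, or -1 if none. */
    public static int nextRowStart(ByteBuffer block, int currentPos) {
        int limit = block.limit();
        int count = block.getInt(limit - 4);  // row count is the last int
        int base = limit - 4 - 4 * count;     // where the position list begins
        int lo = 0;
        int hi = count;
        // Binary search for the first stored position greater than currentPos.
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (block.getInt(base + 4 * mid) > currentPos) {
                hi = mid;
            } else {
                lo = mid + 1;
            }
        }
        return lo == count ? -1 : block.getInt(base + 4 * lo);
    }
}
```

Because only positions are stored, the trailer costs four bytes per row regardless of key length.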
  
  === Exercise ===
  
