hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Lucene-hadoop Wiki] Update of "Hbase/HbaseArchitecture" by JimKellerman
Date Tue, 13 Mar 2007 19:04:05 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by JimKellerman:
http://wiki.apache.org/lucene-hadoop/Hbase/HbaseArchitecture

------------------------------------------------------------------------------
   1. File compaction is relatively slow; we should have a more conservative algorithm for deciding when to apply compaction.
   1. For the getFull() operation, use of Bloom filters would speed things up.
   1. We need stress-test and performance-number tools for the whole system.
-  1. There's some HRegion-specific testing code that worked fine during development, but it has to be rewritten so it works against an HRegion while it's hosted by an H!RegionServer, and connected to an H!BaseMaster. This code is at the bottom of the HRegion.java file.
+  1. There's some HRegion-specific testing code in the form of a JUnit test for HRegion. A new version of this test has to be written so that it works against an HRegion while it's hosted by an H!RegionServer and connected to an H!BaseMaster.
+  1. Implement some kind of block caching in HRegion. Although the DFS isn't hitting the disk to fetch blocks, HRegion is still making IPC calls to the DFS (via !MapFile).
+  1. Investigate a possible performance problem or memory management issue related to random reads. As more and more random reads are done, performance degrades and the memory footprint increases.
+  1. Extend scanners to support iterating over all the columns in a family ("family-name:"), and to support regular expressions for column family members.
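
The scanner extension in the last item could be approached along the following lines. This is only a minimal sketch under stated assumptions: the class name ColumnMatcher and its methods are hypothetical and are not part of the HBase code; it assumes column names have the form "family:member", that a spec with an empty member ("family-name:") selects every column in the family, and that any other member text is a java.util.regex pattern that must match the whole member name.

```java
import java.util.regex.Pattern;

/**
 * Hypothetical helper (not part of HBase): decides whether a column name
 * of the form "family:member" is selected by a scanner column spec.
 */
public final class ColumnMatcher {
    // Family prefix including the trailing colon, e.g. "info:".
    private final String family;
    // Pattern for the member part; null means "every member of the family".
    private final Pattern memberPattern;

    private ColumnMatcher(String family, Pattern memberPattern) {
        this.family = family;
        this.memberPattern = memberPattern;
    }

    /** Parses a spec such as "info:" (whole family) or "anchor:.*\.org" (regex). */
    public static ColumnMatcher forSpec(String spec) {
        int colon = spec.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("expected family:member, got " + spec);
        }
        String family = spec.substring(0, colon + 1);
        String member = spec.substring(colon + 1);
        Pattern pattern = member.isEmpty() ? null : Pattern.compile(member);
        return new ColumnMatcher(family, pattern);
    }

    /** Returns true if the given full column name is selected by this spec. */
    public boolean matches(String column) {
        if (!column.startsWith(family)) {
            return false; // different column family
        }
        if (memberPattern == null) {
            return true; // "family-name:" selects every column in the family
        }
        String member = column.substring(family.length());
        return memberPattern.matcher(member).matches();
    }
}
```

A scanner could build one such matcher per requested column spec and skip any stored column for which no matcher returns true. Requiring the regular expression to match the whole member name is a design choice of this sketch, not something the page specifies.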
  
  [[Anchor(comments)]]
  = Comments =
