hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Lucene-hadoop Wiki] Update of "Hbase/HbaseArchitecture" by JimKellerman
Date Mon, 24 Sep 2007 22:46:05 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by JimKellerman:

The comment on the change is:
add issue: when is region server dead?

  The multi-machine components (the HMaster and the H!RegionServer) are actively being enhanced
and debugged.
- Other related features and TODOs:
+ Issues and TODOs:
+  1. How do we know whether a region server is really dead, as opposed to the network being
partitioned or the region server merely being late in reporting in or renewing its lease? If we
decide that a region server is dead when it is not, it may still be performing updates on behalf
of clients and appending to its log. Only when it next reports in successfully does it learn that
the master has "delisted" it, and only then does it start flushing the cache, finishing the
log, etc. In the meantime the master may be pulling the rug out from under it by trying to
split its log file (the most recent of which will appear zero length, because the file is visible
but has no content until the region server closes it), and may have already reassigned the regions
it was serving to another region server. At a minimum this loses data; in the worst case it
corrupts the region. This issue is being addressed in [https://issues.apache.org/jira/browse/HADOOP-1937]
   1. Vuk Ercegovac [[MailTo(vercego AT SPAMFREE us DOT ibm DOT com)]] of IBM Almaden Research
pointed out that keeping HBase HRegion edit logs in HDFS is currently flawed. HBase writes
edits to logs and to a memcache. The 'atomic' write to the log is meant to serve as insurance
against abnormal !RegionServer exit: on startup, the log is rerun to reconstruct an HRegion's
last consistent state. But files in HDFS do not 'exist' until they are cleanly closed -- something
that will not happen if the !RegionServer exits without running its 'close', so the edits in an
unclosed log may not be available for replay.
   1. The HMemcache lookup structure is relatively inefficient.
   1. Implement some kind of block caching in HRegion. Although the DFS is not hitting the disk
to fetch blocks, HRegion is still making IPC calls to the DFS (via !MapFile).
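
The first issue above (is a region server dead, or merely late?) can be illustrated with a minimal Java sketch of lease-based liveness tracking. All names here (LeaseTable, RegionServerLease, report, isExpired, mayAcceptUpdates) are hypothetical; they are not taken from the HBase source or from HADOOP-1937. The sketch shows that an expired lease tells the master only that a report is overdue. It also shows one possible mitigation, which is an assumption and not the agreed fix: the region server stops accepting updates once its own view of the lease has lapsed. This assumes the two clocks advance at comparable rates.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical master-side lease table (illustration only, not HBase code). */
class LeaseTable {
    private final long leaseMillis;
    private final Map<String, Long> lastReport = new HashMap<>();

    LeaseTable(long leaseMillis) {
        this.leaseMillis = leaseMillis;
    }

    /** Record that a region server reported in at the given time. */
    void report(String server, long nowMillis) {
        lastReport.put(server, nowMillis);
    }

    /**
     * True if no report arrived within the lease period. This cannot tell a
     * dead server from a partitioned or slow one; it only says "overdue".
     */
    boolean isExpired(String server, long nowMillis) {
        Long last = lastReport.get(server);
        return last == null || nowMillis - last > leaseMillis;
    }
}

/** Hypothetical region-server-side view of its own lease (self-fencing). */
class RegionServerLease {
    private final long leaseMillis;
    private long lastRenewedMillis;

    RegionServerLease(long leaseMillis, long startMillis) {
        this.leaseMillis = leaseMillis;
        this.lastRenewedMillis = startMillis;
    }

    /** Called when the master acknowledges a renewal. */
    void renewed(long nowMillis) {
        lastRenewedMillis = nowMillis;
    }

    /** Refuse client updates once the lease may have lapsed on the master. */
    boolean mayAcceptUpdates(long nowMillis) {
        return nowMillis - lastRenewedMillis <= leaseMillis;
    }
}
```

With a check like mayAcceptUpdates in front of every client update, a server that has lost contact with the master would stop appending to its log at about the time the master starts splitting it. In practice the server-side period would need to be somewhat shorter than the master's to leave a safety margin.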
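
For the block-caching item above, one simple shape such a cache could take is a bounded LRU map built on java.util.LinkedHashMap. This is only a sketch under stated assumptions: the class name BlockCache, the string key (for example a file name plus block offset), and the capacity counted in blocks are all hypothetical, and nothing here reflects how HRegion or !MapFile are actually structured.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical LRU cache of recently read blocks (illustration only).
 * Keys are assumed to identify a block, e.g. "file:offset".
 */
class BlockCache extends LinkedHashMap<String, byte[]> {
    private final int maxBlocks;

    BlockCache(int maxBlocks) {
        // accessOrder = true: iteration order runs from least to most recently used
        super(16, 0.75f, true);
        this.maxBlocks = maxBlocks;
    }

    /** Evict the least recently used block once capacity is exceeded. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
        return size() > maxBlocks;
    }
}
```

A reader would consult the cache before making the IPC call and insert the block after a miss. LinkedHashMap is not thread-safe, so a shared cache would need external synchronization (for example Collections.synchronizedMap).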
