hadoop-hdfs-dev mailing list archives

From "Patrick Kling" <pkl...@cs.uwaterloo.ca>
Subject Review Request: Populate needed replication queues before leaving safe mode.
Date Wed, 17 Nov 2010 00:39:18 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/105/
-----------------------------------------------------------

Review request for hadoop-hdfs.


Summary
-------

This patch introduces a new configuration variable dfs.namenode.replqueue.threshold-pct that
determines the fraction of blocks for which block reports have to be received before the NameNode
will start initializing the needed replication queues. Once a sufficient number of block reports
have been received, the queues are initialized while the NameNode is still in safe mode. After
the queues are initialized, subsequent block reports are handled by updating the queues incrementally.
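
The threshold behaviour described above can be sketched roughly as follows. This is only an illustration, not code from the patch: the class ReplQueueGate, its methods, and the log strings are hypothetical, and the sketch assumes the threshold is given as a fraction between 0 and 1 of the total block count.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of the threshold logic; not the patch's actual classes. */
class ReplQueueGate {
    private final long threshold;        // block reports needed before the full scan
    private long reported = 0;           // blocks reported so far
    private boolean initialized = false; // true once the queues have been built
    private final List<String> log = new ArrayList<>();

    ReplQueueGate(long totalBlocks, double thresholdPct) {
        // Assumption: thresholdPct is a fraction in [0, 1].
        this.threshold = (long) Math.ceil(totalBlocks * thresholdPct);
    }

    /** Called once per reported block while still in safe mode. */
    void blockReported(long blockId) {
        reported++;
        if (initialized) {
            // Queues already exist: update them for this block only.
            log.add("incremental " + blockId);
        } else if (reported >= threshold) {
            // Threshold reached: one full traversal of the blocks map.
            initialized = true;
            log.add("full-scan");
        }
    }

    boolean isInitialized() { return initialized; }

    List<String> log() { return log; }

    public static void main(String[] args) {
        ReplQueueGate gate = new ReplQueueGate(100, 0.75);
        for (long id = 0; id < 100; id++) {
            gate.blockReported(id);
        }
        System.out.println(gate.log().size() + " queue operations");
    }
}
```

The point of the sketch is that the expensive full traversal happens exactly once, before the last block reports arrive, and every later report costs only an incremental update.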

The benefit of this is twofold:
- It allows us to compute the replication queues while we are waiting for the last few block
reports (when the NameNode is mostly idle). Once these block reports have been received, we
can then immediately leave safe mode without having to wait for the computation of the needed
replication queues (which requires a full traversal of the blocks map).
- With Raid, it may not be necessary to stay in safe mode until all blocks have been reported.
With this change, we could monitor whether all of the missing blocks can be recreated from parity
information and, if so, leave safe mode early. For this monitoring to work, we need
access to the needed replication queues while the NameNode is still in safe mode.


This addresses bug HDFS-1476.
    https://issues.apache.org/jira/browse/HDFS-1476


Diffs
-----

  http://svn.apache.org/repos/asf/hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/DFSConfigKeys.java 1035545
  http://svn.apache.org/repos/asf/hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/server/namenode/BlockManager.java 1035545
  http://svn.apache.org/repos/asf/hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java 1035545
  http://svn.apache.org/repos/asf/hadoop/hdfs/trunk/src/test/hdfs/org/apache/hadoop/hdfs/MiniDFSCluster.java 1035545
  http://svn.apache.org/repos/asf/hadoop/hdfs/trunk/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestListCorruptFileBlocks.java 1035545

Diff: https://reviews.apache.org/r/105/diff


Testing
-------

Added a new test case in TestListCorruptFileBlocks.


Thanks,

Patrick

