hadoop-hdfs-dev mailing list archives

From: "Dhruba Borthakur" <dhr...@gmail.com>
Subject: Re: Review Request: Populate needed replication queues before leaving safe mode.
Date: Wed, 17 Nov 2010 01:38:55 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/105/#review39
-----------------------------------------------------------



http://svn.apache.org/repos/asf/hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
<https://reviews.apache.org/r/105/#comment27>

    Please change the default to 1, so that it is backward compatible.



http://svn.apache.org/repos/asf/hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
<https://reviews.apache.org/r/105/#comment29>

    We can check canInitializeReplQueue first to save CPU.



http://svn.apache.org/repos/asf/hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
<https://reviews.apache.org/r/105/#comment28>

    This can move to after the SafeMode daemon is created.


- Dhruba


On 2010-11-16 16:39:18, Patrick Kling wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/105/
> -----------------------------------------------------------
> 
> (Updated 2010-11-16 16:39:18)
> 
> 
> Review request for hadoop-hdfs.
> 
> 
> Summary
> -------
> 
> This patch introduces a new configuration variable dfs.namenode.replqueue.threshold-pct
> that determines the fraction of blocks for which block reports have to be received before
> the NameNode will start initializing the needed replication queues. Once a sufficient number
> of block reports have been received, the queues are initialized while the NameNode is still
> in safe mode. After the queues are initialized, subsequent block reports are handled by updating
> the queues incrementally.
> 
> The benefit of this is twofold:
> - It allows us to compute the replication queues while we are waiting for the last few
>   block reports (when the NameNode is mostly idle). Once these block reports have been received,
>   we can then immediately leave safe mode without having to wait for the computation of the
>   needed replication queues (which requires a full traversal of the blocks map).
> - With Raid, it may not be necessary to stay in safe mode until all blocks have been
>   reported. Using this change, we could monitor if all of the missing blocks can be recreated
>   using parity information and if so leave safe mode early. In order for this monitoring to
>   work, we need access to the needed replication queues while the NameNode is still in safe
>   mode.
> 
> 
> This addresses bug HDFS-1476.
>     https://issues.apache.org/jira/browse/HDFS-1476
> 
> 
> Diffs
> -----
> 
>   http://svn.apache.org/repos/asf/hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/DFSConfigKeys.java 1035545
>   http://svn.apache.org/repos/asf/hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/server/namenode/BlockManager.java 1035545
>   http://svn.apache.org/repos/asf/hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java 1035545
>   http://svn.apache.org/repos/asf/hadoop/hdfs/trunk/src/test/hdfs/org/apache/hadoop/hdfs/MiniDFSCluster.java 1035545
>   http://svn.apache.org/repos/asf/hadoop/hdfs/trunk/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestListCorruptFileBlocks.java 1035545
> 
> Diff: https://reviews.apache.org/r/105/diff
> 
> 
> Testing
> -------
> 
> new test case in TestListCorruptFileBlocks
> 
> 
> Thanks,
> 
> Patrick
> 
>
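
The threshold behavior described in the quoted summary can be sketched roughly
as follows. This is a minimal illustration and not code from the patch: the
class ReplQueueThresholdSketch, its fields, and its counters are all
hypothetical, and the way the block thresholds are derived from the fractions
is an assumption. Only the configuration key
dfs.namenode.replqueue.threshold-pct and the overall sequence (one full
initialization once enough blocks are reported, incremental updates
afterwards, all while still in safe mode) come from the summary.

```java
/**
 * Hypothetical sketch of the behavior described in the review summary.
 * None of these names are taken from the actual patch.
 */
public class ReplQueueThresholdSketch {
    private final long safeModeBlockThreshold;   // blocks needed to leave safe mode
    private final long replQueueBlockThreshold;  // blocks needed to initialize the queues
    private long reportedBlocks = 0;
    private boolean replQueuesInitialized = false;
    private int fullTraversals = 0;      // stands in for the costly blocks-map traversal
    private int incrementalUpdates = 0;  // stands in for cheap per-report queue updates

    /**
     * @param totalBlocks           number of blocks the NameNode expects
     * @param safeModeThresholdPct  fraction of blocks required to leave safe mode
     * @param replQueueThresholdPct assumed value of dfs.namenode.replqueue.threshold-pct
     */
    public ReplQueueThresholdSketch(long totalBlocks,
                                    double safeModeThresholdPct,
                                    double replQueueThresholdPct) {
        // Assumption: thresholds are the truncated product of total and fraction.
        this.safeModeBlockThreshold = (long) (totalBlocks * safeModeThresholdPct);
        this.replQueueBlockThreshold = (long) (totalBlocks * replQueueThresholdPct);
    }

    /** Cheap check: have enough blocks been reported to build the queues? */
    public boolean canInitializeReplQueues() {
        return reportedBlocks >= replQueueBlockThreshold;
    }

    /** Have enough blocks been reported to leave safe mode? */
    public boolean canLeaveSafeMode() {
        return reportedBlocks >= safeModeBlockThreshold;
    }

    /** Called once per reported block while the NameNode is in safe mode. */
    public void blockReported() {
        reportedBlocks++;
        if (replQueuesInitialized) {
            // Queues already exist: update them incrementally.
            incrementalUpdates++;
        } else if (canInitializeReplQueues()) {
            // Threshold just reached: do the one full traversal now,
            // while still waiting for the remaining block reports.
            fullTraversals++;
            replQueuesInitialized = true;
        }
    }

    public boolean replQueuesInitialized() { return replQueuesInitialized; }
    public int fullTraversals() { return fullTraversals; }
    public int incrementalUpdates() { return incrementalUpdates; }
}
```

In this sketch, a replication-queue fraction equal to the safe-mode fraction
means the queues are built only when safe mode could already be left, while a
lower fraction moves the full traversal earlier so that it overlaps with the
wait for the last block reports.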

