Subject: Re: Review Request: Populate needed replication queues before leaving safe mode.
From: "Patrick Kling"
To: "Dhruba Borthakur", "Patrick Kling", "hadoop-hdfs"
Date: Thu, 18 Nov 2010 18:49:38 -0000

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/105/
-----------------------------------------------------------

(Updated 2010-11-18 10:49:38.102334)

Review request for hadoop-hdfs.

Changes
-------

Changed default value of replication queue threshold to safe mode threshold.

Summary
-------

This patch introduces a new configuration variable dfs.namenode.replqueue.threshold-pct that determines the fraction of blocks for which block reports have to be received before the NameNode will start initializing the needed replication queues. Once a sufficient number of block reports have been received, the queues are initialized while the NameNode is still in safe mode. After the queues are initialized, subsequent block reports are handled by updating the queues incrementally.

The benefit of this is twofold:

- It allows us to compute the replication queues while we are waiting for the last few block reports (when the NameNode is mostly idle). Once these block reports have been received, we can then immediately leave safe mode without having to wait for the computation of the needed replication queues (which requires a full traversal of the blocks map).

- With Raid, it may not be necessary to stay in safe mode until all blocks have been reported. Using this change, we could monitor if all of the missing blocks can be recreated using parity information and if so leave safe mode early. In order for this monitoring to work, we need access to the needed replication queues while the NameNode is still in safe mode.

This addresses bug HDFS-1476.
    https://issues.apache.org/jira/browse/HDFS-1476

Diffs (updated)
-----

  http://svn.apache.org/repos/asf/hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/DFSConfigKeys.java 1035545
  http://svn.apache.org/repos/asf/hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/server/namenode/BlockManager.java 1035545
  http://svn.apache.org/repos/asf/hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java 1035545
  http://svn.apache.org/repos/asf/hadoop/hdfs/trunk/src/test/hdfs/org/apache/hadoop/hdfs/MiniDFSCluster.java 1035545
  http://svn.apache.org/repos/asf/hadoop/hdfs/trunk/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestListCorruptFileBlocks.java 1035545

Diff: https://reviews.apache.org/r/105/diff

Testing
-------

new test case in TestListCorruptFileBlocks

Thanks,

Patrick
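
Illustration
-------

The two-threshold behaviour described in the Summary can be sketched as a small stand-alone model. This is not the code from the patch: the class ReplQueueThresholdSketch, its fields and its methods are hypothetical names made up for illustration, and the real logic in FSNamesystem may differ (for example, it also has to deal with the safe mode extension period). Only the configuration key dfs.namenode.replqueue.threshold-pct and the "default equals the safe mode threshold" rule come from this review request.

```java
/**
 * Toy model of the startup behaviour described in the Summary.
 * NOT the FSNamesystem code from the patch; every name here is hypothetical.
 */
final class ReplQueueThresholdSketch {
    private final long totalBlocks;
    // Fraction of blocks that must be reported before safe mode can be left.
    private final double safeModeThreshold;
    // Fraction of blocks that must be reported before the replication queues
    // are built (stands in for dfs.namenode.replqueue.threshold-pct).
    private final double replQueueThreshold;

    private long reportedBlocks = 0;
    private boolean inSafeMode = true;
    private boolean replQueuesInitialized = false;
    private int fullTraversals = 0;      // expensive scans of the whole blocks map
    private int incrementalUpdates = 0;  // cheap per-report queue updates

    ReplQueueThresholdSketch(long totalBlocks, double safeModeThreshold,
                             double replQueueThreshold) {
        this.totalBlocks = totalBlocks;
        this.safeModeThreshold = safeModeThreshold;
        this.replQueueThreshold = replQueueThreshold;
    }

    /** Default from the "Changes" note: queue threshold = safe mode threshold. */
    ReplQueueThresholdSketch(long totalBlocks, double safeModeThreshold) {
        this(totalBlocks, safeModeThreshold, safeModeThreshold);
    }

    private boolean reached(double threshold) {
        return reportedBlocks >= threshold * totalBlocks;
    }

    private void initializeReplQueues() {
        fullTraversals++;                // one full traversal of the blocks map
        replQueuesInitialized = true;
    }

    /** Called once for every block that a DataNode reports. */
    void blockReported() {
        reportedBlocks++;
        if (replQueuesInitialized) {
            incrementalUpdates++;        // queues exist: update them in place
        } else if (reached(replQueueThreshold)) {
            initializeReplQueues();      // still in safe mode at this point
        }
        if (inSafeMode && reached(safeModeThreshold)) {
            if (!replQueuesInitialized) {
                initializeReplQueues();  // queue threshold set above safe mode threshold
            }
            inSafeMode = false;          // no traversal left to wait for
        }
    }

    boolean isInSafeMode() { return inSafeMode; }
    boolean areReplQueuesInitialized() { return replQueuesInitialized; }
    int getFullTraversals() { return fullTraversals; }
    int getIncrementalUpdates() { return incrementalUpdates; }

    public static void main(String[] args) {
        // 8 blocks, leave safe mode at 75% reported, build queues at 50%.
        ReplQueueThresholdSketch nn = new ReplQueueThresholdSketch(8, 0.75, 0.5);
        for (int i = 1; i <= 8; i++) {
            nn.blockReported();
            System.out.println(i + " reported: queuesInitialized="
                + nn.areReplQueuesInitialized() + " inSafeMode=" + nn.isInSafeMode());
        }
    }
}
```

With a queue threshold below the safe mode threshold, the single full traversal happens while block reports are still arriving, and the later reports only trigger incremental updates. With the two-argument constructor, both thresholds coincide and the traversal happens at the moment safe mode is left.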