From: eli@apache.org
To: hdfs-commits@hadoop.apache.org
Subject: svn commit: r1355089 - in /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs: ./ src/main/java/org/apache/hadoop/hdfs/ src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/ src/main/resources/
Date: Thu, 28 Jun 2012 17:54:53 -0000
Message-Id: <20120628175454.205A8238896F@eris.apache.org>
Reply-To: hdfs-dev@hadoop.apache.org
X-Mailer: svnmailer-1.0.8-patched

Author: eli
Date: Thu Jun 28 17:54:52 2012
New Revision: 1355089

URL:
http://svn.apache.org/viewvc?rev=1355089&view=rev

Log:
HDFS-3475. Make the replication monitor multipliers configurable.
Contributed by Harsh J Chouraria

Modified:
    hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
    hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
    hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
    hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml

Modified: hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
URL: http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt?rev=1355089&r1=1355088&r2=1355089&view=diff
==============================================================================
--- hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt (original)
+++ hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Thu Jun 28 17:54:52 2012
@@ -250,6 +250,9 @@ Branch-2 ( Unreleased changes )
 
     HDFS-3572. Cleanup code which inits SPNEGO in HttpServer (todd)
 
+    HDFS-3475. Make the replication monitor multipliers configurable.
+    (harsh via eli)
+
   OPTIMIZATIONS
 
     HDFS-2982. Startup performance suffers when there are many edit log

Modified: hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
URL: http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java?rev=1355089&r1=1355088&r2=1355089&view=diff
==============================================================================
--- hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java (original)
+++ hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java Thu Jun 28 17:54:52 2012
@@ -165,6 +165,14 @@ public class DFSConfigKeys extends Commo
   public static final String  DFS_DATANODE_SOCKET_REUSE_KEEPALIVE_KEY = "dfs.datanode.socket.reuse.keepalive";
   public static final int     DFS_DATANODE_SOCKET_REUSE_KEEPALIVE_DEFAULT = 1000;
 
+  // Replication monitoring related keys
+  public static final String  DFS_NAMENODE_INVALIDATE_WORK_PCT_PER_ITERATION =
+      "dfs.namenode.invalidate.work.pct.per.iteration";
+  public static final int     DFS_NAMENODE_INVALIDATE_WORK_PCT_PER_ITERATION_DEFAULT = 32;
+  public static final String  DFS_NAMENODE_REPLICATION_WORK_MULTIPLIER_PER_ITERATION =
+      "dfs.namenode.replication.work.multiplier.per.iteration";
+  public static final int     DFS_NAMENODE_REPLICATION_WORK_MULTIPLIER_PER_ITERATION_DEFAULT = 2;
+
   //Delegation token related keys
   public static final String  DFS_NAMENODE_DELEGATION_KEY_UPDATE_INTERVAL_KEY = "dfs.namenode.delegation.key.update-interval";
   public static final long    DFS_NAMENODE_DELEGATION_KEY_UPDATE_INTERVAL_DEFAULT = 24*60*60*1000; // 1 day

Modified: hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
URL: http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java?rev=1355089&r1=1355088&r2=1355089&view=diff
==============================================================================
--- hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java (original)
+++ hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java Thu Jun 28 17:54:52 2012
@@ -68,6 +68,7 @@ import org.apache.hadoop.net.Node;
 import org.apache.hadoop.util.Daemon;
 
 import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Preconditions;
 import com.google.common.collect.Sets;
 
 /**
@@ -193,6 +194,9 @@ public class BlockManager {
   /** value returned by MAX_CORRUPT_FILES_RETURNED */
   final int maxCorruptFilesReturned;
 
+  final float blocksInvalidateWorkPct;
+  final int blocksReplWorkMultiplier;
+
   /** variable to enable check for enough racks */
   final boolean shouldCheckForEnoughRacks;
 
@@ -245,7 +249,25 @@ public class BlockManager {
     this.maxReplicationStreams = conf.getInt(DFSConfigKeys.DFS_NAMENODE_REPLICATION_MAX_STREAMS_KEY,
                                              DFSConfigKeys.DFS_NAMENODE_REPLICATION_MAX_STREAMS_DEFAULT);
     this.shouldCheckForEnoughRacks = conf.get(DFSConfigKeys.NET_TOPOLOGY_SCRIPT_FILE_NAME_KEY) != null;
-
+
+    this.blocksInvalidateWorkPct = conf.getFloat(
+        DFSConfigKeys.DFS_NAMENODE_INVALIDATE_WORK_PCT_PER_ITERATION,
+        DFSConfigKeys.DFS_NAMENODE_INVALIDATE_WORK_PCT_PER_ITERATION_DEFAULT);
+    Preconditions.checkArgument(
+        (this.blocksInvalidateWorkPct > 0),
+        DFSConfigKeys.DFS_NAMENODE_INVALIDATE_WORK_PCT_PER_ITERATION +
+        " = '" + this.blocksInvalidateWorkPct + "' is invalid. " +
+        "It should be a positive, non-zero float value " +
+        "indicating a percentage.");
+    this.blocksReplWorkMultiplier = conf.getInt(
+        DFSConfigKeys.DFS_NAMENODE_REPLICATION_WORK_MULTIPLIER_PER_ITERATION,
+        DFSConfigKeys.DFS_NAMENODE_REPLICATION_WORK_MULTIPLIER_PER_ITERATION_DEFAULT);
+    Preconditions.checkArgument(
+        (this.blocksReplWorkMultiplier > 0),
+        DFSConfigKeys.DFS_NAMENODE_REPLICATION_WORK_MULTIPLIER_PER_ITERATION +
+        " = '" + this.blocksReplWorkMultiplier + "' is invalid. " +
+        "It should be a positive, non-zero integer value.");
+
     this.replicationRecheckInterval = conf.getInt(DFSConfigKeys.DFS_NAMENODE_REPLICATION_INTERVAL_KEY,
         DFSConfigKeys.DFS_NAMENODE_REPLICATION_INTERVAL_DEFAULT) * 1000L;
 
@@ -2897,8 +2919,6 @@ assert storedBlock.findDatanode(dn) < 0
    * Periodically calls computeReplicationWork().
    */
   private class ReplicationMonitor implements Runnable {
-    private static final int INVALIDATE_WORK_PCT_PER_ITERATION = 32;
-    private static final int REPLICATION_WORK_MULTIPLIER_PER_ITERATION = 2;
 
     @Override
     public void run() {
@@ -2938,9 +2958,9 @@ assert storedBlock.findDatanode(dn) < 0
     final int numlive = heartbeatManager.getLiveDatanodeCount();
 
     final int blocksToProcess = numlive
-        * ReplicationMonitor.REPLICATION_WORK_MULTIPLIER_PER_ITERATION;
+        * this.blocksReplWorkMultiplier;
     final int nodesToProcess = (int) Math.ceil(numlive
-        * ReplicationMonitor.INVALIDATE_WORK_PCT_PER_ITERATION / 100.0);
+        * this.blocksInvalidateWorkPct);
 
     int workFound = this.computeReplicationWork(blocksToProcess);

Modified: hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
URL: http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml?rev=1355089&r1=1355088&r2=1355089&view=diff
==============================================================================
--- hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml (original)
+++ hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml Thu Jun 28 17:54:52 2012
@@ -870,5 +870,35 @@
     ${dfs.web.authentication.kerberos.principal}
   </description>
 </property>
 
+<property>
+  <name>dfs.namenode.invalidate.work.pct.per.iteration</name>
+  <value>0.32f</value>
+  <description>
+    *Note*: Advanced property. Change with caution.
+    This determines the percentage amount of block
+    invalidations (deletes) to do over a single DN heartbeat
+    deletion command. The final deletion count is determined by applying this
+    percentage to the number of live nodes in the system.
+    The resultant number is the number of blocks from the deletion list
+    chosen for proper invalidation over a single heartbeat of a single DN.
+    Value should be a positive, non-zero percentage in float notation (X.Yf),
+    with 1.0f meaning 100%.
+  </description>
+</property>
+
+<property>
+  <name>dfs.namenode.replication.work.multiplier.per.iteration</name>
+  <value>2</value>
+  <description>
+    *Note*: Advanced property. Change with caution.
+    This determines the total amount of block transfers to begin in
+    parallel at a DN, for replication, when such a command list is being
+    sent over a DN heartbeat by the NN. The actual number is obtained by
+    multiplying this multiplier with the total number of live nodes in the
+    cluster. The result number is the number of blocks to begin transfers
+    immediately for, per DN heartbeat. This number can be any positive,
+    non-zero integer.
+  </description>
+</property>
+
 </configuration>
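For readers skimming the diff: the two new settings size each iteration of the NameNode's replication monitor. The patched computeDatanodeWork() multiplies the live DataNode count by the replication work multiplier to cap replications scheduled per iteration, and by the invalidation percentage (a fraction, 0.32f meaning 32%) to pick how many nodes receive deletion commands. A minimal standalone sketch of that arithmetic (not Hadoop code; class and method names are hypothetical):

```java
// Hypothetical sketch of the work-sizing math from the patched
// computeDatanodeWork() in BlockManager.java. Not part of Hadoop.
public class ReplicationWorkSizing {

  // blocksToProcess = live DNs * dfs.namenode.replication.work.multiplier.per.iteration
  static int blocksToProcess(int liveDatanodes, int replWorkMultiplier) {
    return liveDatanodes * replWorkMultiplier;
  }

  // nodesToProcess = ceil(live DNs * dfs.namenode.invalidate.work.pct.per.iteration),
  // where the setting is a fraction: 0.32f means 32% (note the patch drops the old
  // "/ 100.0" because the configured value is already fractional).
  static int nodesToProcess(int liveDatanodes, float invalidateWorkPct) {
    return (int) Math.ceil(liveDatanodes * invalidateWorkPct);
  }

  public static void main(String[] args) {
    int live = 200; // hypothetical live DataNode count
    // With the defaults from the patch (multiplier 2, pct 0.32f):
    System.out.println(blocksToProcess(live, 2));    // 400 replications per iteration
    System.out.println(nodesToProcess(live, 0.32f)); // 64 nodes sent deletion work
  }
}
```

Raising either value makes the cluster re-replicate or delete more aggressively per heartbeat at the cost of more load on DataNodes, which is why the new descriptions in hdfs-default.xml flag both as advanced properties.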