Mailing-List: contact core-commits-help@hadoop.apache.org; run by ezmlm
Reply-To: core-dev@hadoop.apache.org
Subject: svn commit: r748728 - in /hadoop/core/trunk: ./ src/docs/src/documentation/content/xdocs/ src/hdfs/org/apache/hadoop/hdfs/ src/hdfs/org/apache/hadoop/hdfs/protocol/ src/hdfs/org/apache/hadoop/hdfs/server/namenode/ src/hdfs/org/apache/hadoop/hdfs/tools/...
Date: Fri, 27 Feb 2009 22:49:13 -0000 To: core-commits@hadoop.apache.org From: szetszwo@apache.org X-Mailer: svnmailer-1.0.8 Message-Id: <20090227224914.2A0A02388995@eris.apache.org> X-Virus-Checked: Checked by ClamAV on apache.org Author: szetszwo Date: Fri Feb 27 22:49:12 2009 New Revision: 748728 URL: http://svn.apache.org/viewvc?rev=748728&view=rev Log: HADOOP-5144. Add a new DFSAdmin command for changing the setting of restore failed storage replicas in namenode. (Boris Shkolnik via szetszwo) Modified: hadoop/core/trunk/CHANGES.txt hadoop/core/trunk/src/docs/src/documentation/content/xdocs/commands_manual.xml hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/DFSClient.java hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/DistributedFileSystem.java hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/protocol/ClientProtocol.java hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/server/namenode/FSImage.java hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/server/namenode/NameNode.java hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/tools/DFSAdmin.java hadoop/core/trunk/src/test/org/apache/hadoop/hdfs/server/namenode/TestStorageRestore.java Modified: hadoop/core/trunk/CHANGES.txt URL: http://svn.apache.org/viewvc/hadoop/core/trunk/CHANGES.txt?rev=748728&r1=748727&r2=748728&view=diff ============================================================================== --- hadoop/core/trunk/CHANGES.txt (original) +++ hadoop/core/trunk/CHANGES.txt Fri Feb 27 22:49:12 2009 @@ -49,6 +49,9 @@ HADOOP-4927. Adds a generic wrapper around outputformat to allow creation of output on demand (Jothi Padmanabhan via ddas) + HADOOP-5144. Add a new DFSAdmin command for changing the setting of restore + failed storage replicas in namenode. (Boris Shkolnik via szetszwo) + IMPROVEMENTS HADOOP-4565. 
Added CombineFileInputFormat to use data locality information Modified: hadoop/core/trunk/src/docs/src/documentation/content/xdocs/commands_manual.xml URL: http://svn.apache.org/viewvc/hadoop/core/trunk/src/docs/src/documentation/content/xdocs/commands_manual.xml?rev=748728&r1=748727&r2=748728&view=diff ============================================================================== --- hadoop/core/trunk/src/docs/src/documentation/content/xdocs/commands_manual.xml (original) +++ hadoop/core/trunk/src/docs/src/documentation/content/xdocs/commands_manual.xml Fri Feb 27 22:49:12 2009 @@ -480,6 +480,7 @@ Usage: hadoop dfsadmin [GENERIC_OPTIONS] [-report] [-safemode enter | leave | get | wait] [-refreshNodes] [-finalizeUpgrade] [-upgradeProgress status | details | force] [-metasave filename] [-setQuota <quota> <dirname>...<dirname>] [-clrQuota <dirname>...<dirname>] + [-restoreFailedStorage true|false|check] [-help [cmd]]

@@ -548,6 +549,12 @@ It does not fault if the directory has no quota. + + + + Modified: hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/DFSClient.java URL: http://svn.apache.org/viewvc/hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/DFSClient.java?rev=748728&r1=748727&r2=748728&view=diff ============================================================================== --- hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/DFSClient.java (original) +++ hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/DFSClient.java Fri Feb 27 22:49:12 2009 @@ -730,6 +730,17 @@ throw re.unwrapRemoteException(AccessControlException.class); } } + + /** + * enable/disable restore failed storage. + * See {@link ClientProtocol#restoreFailedStorage()} + * for more details. + * + * @see ClientProtocol#restoreFailedStorage() + */ + boolean restoreFailedStorage(String arg) throws AccessControlException { + return namenode.restoreFailedStorage(arg); + } /** * Refresh the hosts and exclude files. (Rereads them.) Modified: hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/DistributedFileSystem.java URL: http://svn.apache.org/viewvc/hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/DistributedFileSystem.java?rev=748728&r1=748727&r2=748728&view=diff ============================================================================== --- hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/DistributedFileSystem.java (original) +++ hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/DistributedFileSystem.java Fri Feb 27 22:49:12 2009 @@ -363,6 +363,16 @@ } /** + * enable/disable/check restoreFaileStorage + * + * @see org.apache.hadoop.hdfs.protocol.ClientProtocol#restoreFailedStorage() + */ + public boolean restoreFailedStorage(String arg) throws AccessControlException { + return dfs.restoreFailedStorage(arg); + } + + + /** * Refreshes the list of hosts and excluded hosts from the configured * files. 
*/ Modified: hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/protocol/ClientProtocol.java URL: http://svn.apache.org/viewvc/hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/protocol/ClientProtocol.java?rev=748728&r1=748727&r2=748728&view=diff ============================================================================== --- hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/protocol/ClientProtocol.java (original) +++ hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/protocol/ClientProtocol.java Fri Feb 27 22:49:12 2009 @@ -41,9 +41,9 @@ * Compared to the previous version the following changes have been introduced: * (Only the latest change is reflected. * The log of historical changes can be retrieved from the svn). - * 42: updated to use sticky bit + * 43: added restoreFailedStorage command */ - public static final long versionID = 42L; + public static final long versionID = 43L; /////////////////////////////////////// // File contents @@ -375,6 +375,15 @@ public void saveNamespace() throws IOException; /** + * Enable/Disable restore failed storage. + *

+ * sets flag to enable restore of failed storage replicas + * + * @throws AccessControlException if the superuser privilege is violated. + */ + public boolean restoreFailedStorage(String arg) throws AccessControlException; + + /** * Tells the namenode to reread the hosts and exclude files. * @throws IOException */ Modified: hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/server/namenode/FSImage.java URL: http://svn.apache.org/viewvc/hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/server/namenode/FSImage.java?rev=748728&r1=748727&r2=748728&view=diff ============================================================================== --- hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/server/namenode/FSImage.java (original) +++ hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/server/namenode/FSImage.java Fri Feb 27 22:49:12 2009 @@ -119,7 +119,7 @@ */ private boolean restoreFailedStorage = false; public void setRestoreFailedStorage(boolean val) { - LOG.info("enabled failed storage replicas restore"); + LOG.info("set restore failed storage to " + val); restoreFailedStorage=val; } Modified: hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java URL: http://svn.apache.org/viewvc/hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java?rev=748728&r1=748727&r2=748728&view=diff ============================================================================== --- hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java (original) +++ hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java Fri Feb 27 22:49:12 2009 @@ -3441,6 +3441,25 @@ getFSImage().saveFSImage(); LOG.info("New namespace image has been created."); } + + /** + * Enables/Disables/Checks restoring failed storage replicas if the storage becomes available again. + * Requires superuser privilege. + * + * @throws AccessControlException if superuser privilege is violated. 
+ */ + synchronized boolean restoreFailedStorage(String arg) throws AccessControlException { + checkSuperuserPrivilege(); + + // if it is disabled - enable it and vice versa. + if(arg.equals("check")) + return getFSImage().getRestoreFailedStorage(); + + boolean val = arg.equals("true"); // false if not + getFSImage().setRestoreFailedStorage(val); + + return val; + } /** */ Modified: hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/server/namenode/NameNode.java URL: http://svn.apache.org/viewvc/hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/server/namenode/NameNode.java?rev=748728&r1=748727&r2=748728&view=diff ============================================================================== --- hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/server/namenode/NameNode.java (original) +++ hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/server/namenode/NameNode.java Fri Feb 27 22:49:12 2009 @@ -47,6 +47,7 @@ import org.apache.hadoop.util.StringUtils; import org.apache.hadoop.net.NetUtils; import org.apache.hadoop.net.NetworkTopology; +import org.apache.hadoop.security.AccessControlException; import org.apache.hadoop.security.SecurityUtil; import org.apache.hadoop.security.UserGroupInformation; import org.apache.hadoop.security.authorize.AuthorizationException; @@ -609,6 +610,14 @@ } /** + * @throws AccessControlException + * @inheritDoc + */ + public boolean restoreFailedStorage(String arg) throws AccessControlException { + return namesystem.restoreFailedStorage(arg); + } + + /** * @inheritDoc */ public void saveNamespace() throws IOException { Modified: hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/tools/DFSAdmin.java URL: http://svn.apache.org/viewvc/hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/tools/DFSAdmin.java?rev=748728&r1=748727&r2=748728&view=diff ============================================================================== --- hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/tools/DFSAdmin.java (original) +++ 
hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/tools/DFSAdmin.java Fri Feb 27 22:49:12 2009 @@ -391,6 +391,33 @@ } /** + * Command to enable/disable/check restoring of failed storage replicas in the namenode. + * Usage: java DFSAdmin -restoreFailedStorage true|false|check + * @exception IOException + * @see org.apache.hadoop.hdfs.protocol.ClientProtocol#restoreFailedStorage() + */ + public int restoreFaileStorage(String arg) throws IOException { + int exitCode = -1; + + if (!(fs instanceof DistributedFileSystem)) { + System.err.println("FileSystem is " + fs.getUri()); + return exitCode; + } + + if(!arg.equals("check") && !arg.equals("true") && !arg.equals("false")) { + System.err.println("restoreFailedStorage valid args are true|false|check"); + return exitCode; + } + + DistributedFileSystem dfs = (DistributedFileSystem) fs; + Boolean res = dfs.restoreFailedStorage(arg); + System.out.println("restoreFailedStorage is set to " + res); + exitCode = 0; + + return exitCode; + } + + /** * Command to ask the namenode to reread the hosts and excluded hosts * file. 
* Usage: java DFSAdmin -refreshNodes @@ -416,6 +443,7 @@ "The full syntax is: \n\n" + "hadoop dfsadmin [-report] [-safemode ]\n" + "\t[-saveNamespace]\n" + + "\t[-restoreFailedStorage true|false|check]\n" + "\t[-refreshNodes]\n" + "\t[" + SetQuotaCommand.USAGE + "]\n" + "\t[" + ClearQuotaCommand.USAGE +"]\n" + @@ -440,6 +468,10 @@ "Save current namespace into storage directories and reset edits log.\n" + "\t\tRequires superuser permissions and safe mode.\n"; + String restoreFailedStorage = "-restoreFailedStorage:\t" + + "Set/Unset/Check flag to attempt restore of failed storage replicas if they become available.\n" + + "\t\tRequires superuser permissions.\n"; + String refreshNodes = "-refreshNodes: \tUpdates the set of hosts allowed " + "to connect to namenode.\n\n" + "\t\tRe-reads the config file to update values defined by \n" + @@ -480,6 +512,8 @@ System.out.println(safemode); } else if ("saveNamespace".equals(cmd)) { System.out.println(saveNamespace); + } else if ("restoreFailedStorage".equals(cmd)) { + System.out.println(restoreFailedStorage); } else if ("refreshNodes".equals(cmd)) { System.out.println(refreshNodes); } else if ("finalizeUpgrade".equals(cmd)) { @@ -505,6 +539,7 @@ System.out.println(report); System.out.println(safemode); System.out.println(saveNamespace); + System.out.println(restoreFailedStorage); System.out.println(refreshNodes); System.out.println(finalizeUpgrade); System.out.println(upgradeProgress); @@ -647,6 +682,9 @@ } else if ("-saveNamespace".equals(cmd)) { System.err.println("Usage: java DFSAdmin" + " [-saveNamespace]"); + } else if ("-restoreFailedStorage".equals(cmd)) { + System.err.println("Usage: java DFSAdmin" + + " [-restoreFailedStorage true|false|check ]"); } else if ("-refreshNodes".equals(cmd)) { System.err.println("Usage: java DFSAdmin" + " [-refreshNodes]"); @@ -679,6 +717,7 @@ System.err.println(" [-report]"); System.err.println(" [-safemode enter | leave | get | wait]"); System.err.println(" [-saveNamespace]"); + 
System.err.println(" [-restoreFailedStorage true|false|check]"); System.err.println(" [-refreshNodes]"); System.err.println(" [-finalizeUpgrade]"); System.err.println(" [-upgradeProgress status | details | force]"); @@ -729,6 +768,11 @@ printUsage(cmd); return exitCode; } + } else if ("-restoreFailedStorage".equals(cmd)) { + if (argv.length != 2) { + printUsage(cmd); + return exitCode; + } } else if ("-refreshNodes".equals(cmd)) { if (argv.length != 1) { printUsage(cmd); @@ -776,6 +820,8 @@ setSafeMode(argv, i); } else if ("-saveNamespace".equals(cmd)) { exitCode = saveNamespace(); + } else if ("-restoreFailedStorage".equals(cmd)) { + exitCode = restoreFaileStorage(argv[i]); } else if ("-refreshNodes".equals(cmd)) { exitCode = refreshNodes(); } else if ("-finalizeUpgrade".equals(cmd)) { Modified: hadoop/core/trunk/src/test/org/apache/hadoop/hdfs/server/namenode/TestStorageRestore.java URL: http://svn.apache.org/viewvc/hadoop/core/trunk/src/test/org/apache/hadoop/hdfs/server/namenode/TestStorageRestore.java?rev=748728&r1=748727&r2=748728&view=diff ============================================================================== --- hadoop/core/trunk/src/test/org/apache/hadoop/hdfs/server/namenode/TestStorageRestore.java (original) +++ hadoop/core/trunk/src/test/org/apache/hadoop/hdfs/server/namenode/TestStorageRestore.java Fri Feb 27 22:49:12 2009 @@ -20,7 +20,6 @@ import java.io.File; import java.io.IOException; -import java.util.Collection; import java.util.Iterator; import java.util.Random; @@ -28,6 +27,7 @@ import org.apache.commons.logging.Log; import org.apache.commons.logging.LogFactory; +import org.apache.hadoop.cli.util.CommandExecutor; import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.fs.FSDataOutputStream; import org.apache.hadoop.fs.FileSystem; @@ -38,6 +38,7 @@ import org.apache.hadoop.hdfs.server.common.Storage.StorageDirectory; import org.apache.hadoop.hdfs.server.namenode.FSImage.NameNodeDirType; import 
org.apache.hadoop.hdfs.server.namenode.FSImage.NameNodeFile; +import org.apache.hadoop.util.StringUtils; /** @@ -191,7 +192,6 @@ */ public void testStorageRestore() throws Exception { int numDatanodes = 2; - //Collection dirs = config.getStringCollection("dfs.name.dir"); cluster = new MiniDFSCluster(0, config, numDatanodes, true, false, true, null, null, null, null); cluster.waitActive(); @@ -225,4 +225,55 @@ secondary.shutdown(); cluster.shutdown(); } + + /** + * Test dfsadmin -restoreFailedStorage command + * @throws Exception + */ + public void testDfsAdminCmd() throws IOException { + int numDatanodes = 2; + + + cluster = new MiniDFSCluster(0, config, numDatanodes, true, false, true, null, null, null, null); + cluster.waitActive(); + try { + + FSImage fsi = cluster.getNameNode().getFSImage(); + + // it is started with dfs.name.dir.restore set to true (in SetUp()) + boolean restore = fsi.getRestoreFailedStorage(); + LOG.info("Restore is " + restore); + assertEquals(restore, true); + + // now run DFSAdmnin command + + String cmd = "-fs NAMENODE -restoreFailedStorage false"; + String namenode = config.get("fs.default.name", "file:///"); + CommandExecutor.executeDFSAdminCommand(cmd, namenode); + restore = fsi.getRestoreFailedStorage(); + LOG.info("After set true call restore is " + restore); + assertEquals(restore, false); + + // run one more time - to set it to true again + cmd = "-fs NAMENODE -restoreFailedStorage true"; + CommandExecutor.executeDFSAdminCommand(cmd, namenode); + restore = fsi.getRestoreFailedStorage(); + LOG.info("After set false call restore is " + restore); + assertEquals(restore, true); + + // run one more time - no change in value + cmd = "-fs NAMENODE -restoreFailedStorage check"; + CommandExecutor.executeDFSAdminCommand(cmd, namenode); + restore = fsi.getRestoreFailedStorage(); + LOG.info("After check call restore is " + restore); + assertEquals(restore, true); + String commandOutput = CommandExecutor.getLastCommandOutput(); + 
commandOutput.trim(); + assertTrue(commandOutput.contains("restoreFailedStorage is set to true")); + + + } finally { + cluster.shutdown(); + } + } }

-restoreFailedStorage true | false | check
This option will turn on/off automatic attempt to restore failed storage replicas.
+ If a failed storage becomes available again the system will attempt to restore
+ edits and/or fsimage during checkpoint. 'check' option will return current setting.
-help [cmd] Displays help for the given command or all commands if none is specified.
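
The true|false|check semantics described above can be sketched without any Hadoop dependency. The following is a minimal illustration only: the class name `RestoreFailedStorageSketch` and the methods `isValidArg` and `apply` are hypothetical stand-ins for the argument check in DFSAdmin and the flag handling in FSNamesystem shown in this commit, and the superuser check is left out.

```java
// Hypothetical, dependency-free model of the -restoreFailedStorage handling
// shown in this commit. It is not Hadoop code and contacts no namenode.
class RestoreFailedStorageSketch {
    // Stands in for the restoreFailedStorage flag kept by FSImage.
    private boolean restoreFailedStorage = false;

    // Mirrors the client-side check in DFSAdmin: only three arguments are valid.
    static boolean isValidArg(String arg) {
        return "true".equals(arg) || "false".equals(arg) || "check".equals(arg);
    }

    // Mirrors the namenode-side logic: "check" only reads the flag;
    // any other argument sets it ("true" enables, everything else disables).
    synchronized boolean apply(String arg) {
        if ("check".equals(arg)) {
            return restoreFailedStorage;
        }
        restoreFailedStorage = "true".equals(arg);
        return restoreFailedStorage;
    }

    public static void main(String[] args) {
        RestoreFailedStorageSketch flag = new RestoreFailedStorageSketch();
        for (String arg : new String[] {"true", "check", "false", "check"}) {
            if (!isValidArg(arg)) {
                System.err.println("restoreFailedStorage valid args are true|false|check");
                continue;
            }
            System.out.println("restoreFailedStorage is set to " + flag.apply(arg));
        }
    }
}
```

Because any argument other than "true" or "check" disables the flag on the server side, the client-side validation is what keeps a mistyped argument from silently switching the setting off.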