Subject: svn commit: r672072 - /hadoop/core/trunk/src/docs/src/documentation/content/xdocs/commands_manual.xml
From: cdouglas@apache.org
To: core-commits@hadoop.apache.org
Date: Fri, 27 Jun 2008 01:36:54 -0000

Author: cdouglas
Date: Thu Jun 26 18:36:54 2008
New Revision: 672072

URL: http://svn.apache.org/viewvc?rev=672072&view=rev
Log: Checking in missing file from HADOOP-3552.

Added:
    hadoop/core/trunk/src/docs/src/documentation/content/xdocs/commands_manual.xml

==============================================================================
Commands Manual

Overview
All hadoop commands are invoked by the bin/hadoop script. Running the hadoop script without any arguments prints the description for all commands.

Usage: hadoop [--config confdir] [COMMAND] [GENERIC_OPTIONS] [COMMAND_OPTIONS]

Hadoop has an option parsing framework that handles generic options as well as running classes.

COMMAND_OPTION      Description
--config confdir    Overwrites the default Configuration directory. Default is ${HADOOP_HOME}/conf.
GENERIC_OPTIONS     The common set of options supported by multiple commands.
COMMAND
COMMAND_OPTIONS     Various commands with their options are described in the following sections. The commands have been grouped into User Commands and Administration Commands.

Generic Options

The following options are supported by dfsadmin, fs, fsck and job.

GENERIC_OPTION                                    Description
-conf <configuration file>                        Specify an application configuration file.
-D <property=value>                               Use value for given property.
-fs <local|namenode:port>                         Specify a namenode.
-jt <local|jobtracker:port>                       Specify a job tracker. Applies only to job.
-files <comma separated list of files>            Specify comma separated files to be copied to the map reduce cluster. Applies only to job.
-libjars <comma separated list of jars>           Specify comma separated jar files to include in the classpath. Applies only to job.
-archives <comma separated list of archives>      Specify comma separated archives to be unarchived on the compute machines. Applies only to job.
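
For example, a generic option precedes the command's own options. A minimal sketch, where the namenode host and port are illustrative, not defaults:

  hadoop fs -fs namenode.example.com:9000 -ls /

This runs the fs command against the named HDFS instance instead of the namenode configured in ${HADOOP_HOME}/conf.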

User Commands

Commands useful for users of a hadoop cluster.

archive

Creates a hadoop archive. More information can be found in the Hadoop Archives guide.

Usage: hadoop archive -archiveName NAME <src>* <dest>

COMMAND_OPTION       Description
-archiveName NAME    Name of the archive to be created.
src                  Filesystem pathnames, which work as usual with regular expressions.
dest                 Destination directory which will contain the archive.
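
For example, a hypothetical invocation that bundles two source directories into an archive named foo.har under /user/zoo (all paths are illustrative):

  hadoop archive -archiveName foo.har /user/hadoop/dir1 /user/hadoop/dir2 /user/zoo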

distcp

Copies files or directories recursively. More information can be found in the DistCp Guide.

Usage: hadoop distcp <srcurl> <desturl>

COMMAND_OPTION    Description
srcurl            Source URL
desturl           Destination URL
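
For example, a sketch of a copy between two clusters, where the namenode hosts, port and paths are illustrative:

  hadoop distcp hdfs://nn1.example.com:9000/foo/bar hdfs://nn2.example.com:9000/bar/foo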

fs

Usage: hadoop fs [GENERIC_OPTIONS] [COMMAND_OPTIONS]

Runs a generic filesystem user client.

The various COMMAND_OPTIONS can be found in the HDFS Shell Guide.
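
For example, a few common invocations, where the paths and file names are illustrative:

  hadoop fs -ls /user/hadoop
  hadoop fs -put localfile.txt /user/hadoop/localfile.txt
  hadoop fs -cat /user/hadoop/localfile.txt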

fsck

Runs an HDFS filesystem checking utility. See Fsck for more info.

Usage: hadoop fsck [GENERIC_OPTIONS] <path> [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]

COMMAND_OPTION    Description
<path>            Start checking from this path.
-move             Move corrupted files to /lost+found.
-delete           Delete corrupted files.
-openforwrite     Print out files opened for write.
-files            Print out files being checked.
-blocks           Print out the block report.
-locations        Print out locations for every block.
-racks            Print out network topology for data-node locations.
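
For example, a hypothetical check that reports files, blocks and rack placement under a user directory (the path is illustrative):

  hadoop fsck /user/hadoop -files -blocks -racks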

jar

Runs a jar file. Users can bundle their Map Reduce code in a jar file and execute it using this command.

Usage: hadoop jar <jar> [mainClass] args...

Streaming jobs are run via this command; examples can be found in the Streaming documentation.

The word count example is also run using the jar command; see the Wordcount example.
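
For example, a sketch of running a word count job from the bundled examples jar; the jar name and paths are illustrative and vary by release:

  hadoop jar hadoop-examples.jar wordcount /user/hadoop/input /user/hadoop/output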

job

Command to interact with Map Reduce jobs.

Usage: hadoop job [GENERIC_OPTIONS] [-submit <job-file>] | [-status <job-id>] | [-counter <job-id> <group-name> <counter-name>] | [-kill <job-id>] | [-events <job-id> <from-event-#> <#-of-events>] | [-history [all] <jobOutputDir>] | [-list [all]] | [-kill-task <task-id>] | [-fail-task <task-id>]

COMMAND_OPTION                                   Description
-submit <job-file>                               Submits the job.
-status <job-id>                                 Prints the map and reduce completion percentage and all job counters.
-counter <job-id> <group-name> <counter-name>    Prints the counter value.
-kill <job-id>                                   Kills the job.
-events <job-id> <from-event-#> <#-of-events>    Prints the events' details received by the jobtracker for the given range.
-history [all] <jobOutputDir>                    -history <jobOutputDir> prints job details, and failed and killed tip details. More details about the job, such as successful tasks and the task attempts made for each task, can be viewed by specifying the [all] option.
-list [all]                                      -list all displays all jobs. -list displays only jobs which are yet to complete.
-kill-task <task-id>                             Kills the task. Killed tasks are NOT counted against failed attempts.
-fail-task <task-id>                             Fails the task. Failed tasks are counted against failed attempts.
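
For example, a hypothetical session that checks a job's progress and then kills it (the job id is illustrative):

  hadoop job -status job_200806262359_0004
  hadoop job -kill job_200806262359_0004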

pipes

Runs a pipes job.

Usage: hadoop pipes [-conf <path>] [-jobconf <key=value>, <key=value>, ...] [-input <path>] [-output <path>] [-jar <jar file>] [-inputformat <class>] [-map <class>] [-partitioner <class>] [-reduce <class>] [-writer <class>] [-program <executable>] [-reduces <num>]

COMMAND_OPTION                            Description
-conf <path>                              Configuration for job
-jobconf <key=value>, <key=value>, ...    Add/override configuration for job
-input <path>                             Input directory
-output <path>                            Output directory
-jar <jar file>                           Jar filename
-inputformat <class>                      InputFormat class
-map <class>                              Java Map class
-partitioner <class>                      Java Partitioner
-reduce <class>                           Java Reduce class
-writer <class>                           Java RecordWriter
-program <executable>                     Executable URI
-reduces <num>                            Number of reduces
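
For example, a sketch of submitting a C++ executable as a pipes job; the program path, directories and reduce count are illustrative:

  hadoop pipes -program /user/hadoop/bin/wordcount-pipes -input /user/hadoop/in -output /user/hadoop/out -reduces 2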

version

Prints the version.

Usage: hadoop version

CLASSNAME

The hadoop script can be used to invoke any class.

Usage: hadoop CLASSNAME

Runs the class named CLASSNAME.
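
For example, a hypothetical direct invocation of a utility class that is assumed to be on the Hadoop classpath:

  hadoop org.apache.hadoop.util.PlatformName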

Administration Commands

Commands useful for administrators of a hadoop cluster.

balancer

Runs a cluster balancing utility. An administrator can simply press Ctrl-C to stop the rebalancing process. See Rebalancer for more details.

Usage: hadoop balancer [-threshold <threshold>]

COMMAND_OPTION            Description
-threshold <threshold>    Percentage of disk capacity. This overwrites the default threshold.
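
For example, a hypothetical run that rebalances until each datanode's utilization is within 5% of the cluster average:

  hadoop balancer -threshold 5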

daemonlog

Gets or sets the log level for each daemon.

Usage: hadoop daemonlog -getlevel <host:port> <name>
Usage: hadoop daemonlog -setlevel <host:port> <name> <level>

COMMAND_OPTION                          Description
-getlevel <host:port> <name>            Prints the log level of the daemon running at <host:port>. This command internally connects to http://<host:port>/logLevel?log=<name>
-setlevel <host:port> <name> <level>    Sets the log level of the daemon running at <host:port>. This command internally connects to http://<host:port>/logLevel?log=<name>
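
For example, a sketch that inspects and then raises a daemon's log level; the host, port and logger name are illustrative assumptions, not defaults:

  hadoop daemonlog -getlevel datanode1.example.com:50075 org.apache.hadoop.dfs.DataNode
  hadoop daemonlog -setlevel datanode1.example.com:50075 org.apache.hadoop.dfs.DataNode DEBUG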

datanode

Runs an HDFS datanode.

Usage: hadoop datanode [-rollback]

COMMAND_OPTION    Description
-rollback         Rolls back the datanode to the previous version. This should be used after stopping the datanode and distributing the old hadoop version.

dfsadmin

Runs an HDFS dfsadmin client.

Usage: hadoop dfsadmin [GENERIC_OPTIONS] [-report] [-safemode enter | leave | get | wait] [-refreshNodes] [-finalizeUpgrade] [-upgradeProgress status | details | force] [-metasave filename] [-setQuota <quota> <dirname>...<dirname>] [-clrQuota <dirname>...<dirname>] [-help [cmd]]

COMMAND_OPTION                               Description
-report                                      Reports basic filesystem information and statistics.
-safemode enter | leave | get | wait         Safe mode maintenance command. Safe mode is a Namenode state in which it
                                             1. does not accept changes to the name space (read-only), and
                                             2. does not replicate or delete blocks.
                                             Safe mode is entered automatically at Namenode startup, and left automatically when the configured minimum percentage of blocks satisfies the minimum replication condition. Safe mode can also be entered manually, but then it can only be turned off manually as well.
-refreshNodes                                Re-reads the hosts and exclude files to update the set of Datanodes that are allowed to connect to the Namenode, and those that should be decommissioned or recommissioned.
-finalizeUpgrade                             Finalizes an upgrade of HDFS. Datanodes delete their previous version working directories, followed by the Namenode doing the same. This completes the upgrade process.
-upgradeProgress status | details | force    Requests the current distributed upgrade status or a detailed status, or forces the upgrade to proceed.
-metasave filename                           Saves the Namenode's primary data structures to <filename> in the directory specified by the hadoop.log.dir property. <filename> will contain one line for each of the following:
                                             1. Datanodes heart beating with the Namenode
                                             2. Blocks waiting to be replicated
                                             3. Blocks currently being replicated
                                             4. Blocks waiting to be deleted
-setQuota <quota> <dirname>...<dirname>      Sets the quota <quota> for each directory <dirname>. The directory quota is a long integer that puts a hard limit on the number of names in the directory tree. Best effort for each directory, with faults reported if
                                             1. the quota is not a positive integer, or
                                             2. the user is not an administrator, or
                                             3. the directory does not exist or is a file, or
                                             4. the directory would immediately exceed the new quota.
-clrQuota <dirname>...<dirname>              Clears the quota for each directory <dirname>. Best effort for each directory, with a fault reported if
                                             1. the directory does not exist or is a file, or
                                             2. the user is not an administrator.
                                             It does not fault if the directory has no quota.
-help [cmd]                                  Displays help for the given command, or all commands if none is specified.
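
For example, a hypothetical administrative session; the quota value and directory are illustrative:

  hadoop dfsadmin -report
  hadoop dfsadmin -safemode get
  hadoop dfsadmin -setQuota 100000 /user/hadoop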

jobtracker

Runs the MapReduce jobtracker node.

Usage: hadoop jobtracker

namenode

Runs the namenode. More info about upgrade, rollback and finalize is in the Upgrade and Rollback guide.

Usage: hadoop namenode [-format] | [-upgrade] | [-rollback] | [-finalize] | [-importCheckpoint]

COMMAND_OPTION       Description
-format              Formats the namenode. It starts the namenode, formats it and then shuts it down.
-upgrade             The namenode should be started with the upgrade option after a new hadoop version has been distributed.
-rollback            Rolls back the namenode to the previous version. This should be used after stopping the cluster and distributing the old hadoop version.
-finalize            Finalize removes the previous state of the filesystem. The most recent upgrade becomes permanent and the rollback option is no longer available. After finalization it shuts the namenode down.
-importCheckpoint    Loads the image from a checkpoint directory and saves it into the current one. The checkpoint directory is read from the property fs.checkpoint.dir.
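
For example, a one-time format of a brand new filesystem; this is destructive and shown for illustration only:

  hadoop namenode -format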

secondarynamenode

Runs the HDFS secondary namenode. See Secondary Namenode for more info.

Usage: hadoop secondarynamenode [-checkpoint [force]] | [-geteditsize]

COMMAND_OPTION         Description
-checkpoint [force]    Checkpoints the secondary namenode if the EditLog size is >= fs.checkpoint.size. If force is used, the checkpoint is performed regardless of the EditLog size.
-geteditsize           Prints the EditLog size.
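
For example, forcing a checkpoint regardless of the current EditLog size:

  hadoop secondarynamenode -checkpoint force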

tasktracker

Runs a MapReduce tasktracker node.

Usage: hadoop tasktracker