hadoop-common-commits mailing list archives

From yhema...@apache.org
Subject svn commit: r673334 [3/3] - in /hadoop/core/trunk: docs/ src/contrib/hod/ src/docs/src/documentation/content/xdocs/
Date Wed, 02 Jul 2008 09:47:26 GMT
Modified: hadoop/core/trunk/src/docs/src/documentation/content/xdocs/hod_admin_guide.xml
URL: http://svn.apache.org/viewvc/hadoop/core/trunk/src/docs/src/documentation/content/xdocs/hod_admin_guide.xml?rev=673334&r1=673333&r2=673334&view=diff
==============================================================================
--- hadoop/core/trunk/src/docs/src/documentation/content/xdocs/hod_admin_guide.xml (original)
+++ hadoop/core/trunk/src/docs/src/documentation/content/xdocs/hod_admin_guide.xml Wed Jul  2 02:47:25 2008
@@ -17,7 +17,8 @@
 <title>Overview</title>
 
 <p>The Hadoop On Demand (HOD) project is a system for provisioning and
-managing independent Hadoop MapReduce and HDFS instances on a shared cluster 
+managing independent Hadoop Map/Reduce and Hadoop Distributed File System (HDFS)
+instances on a shared cluster 
 of nodes. HOD is a tool that makes it easy for administrators and users to 
quickly set up and use Hadoop. It is also a very useful tool for Hadoop developers 
 and testers who need to share a physical cluster for testing their own Hadoop 
@@ -30,17 +31,17 @@
 </p>
 
 <p>
-The basic system architecture of HOD includes components from:</p>
+The basic system architecture of HOD includes these components:</p>
 <ul>
-  <li>A Resource manager (possibly together with a scheduler),</li>
-  <li>HOD components, and </li>
-  <li>Hadoop Map/Reduce and HDFS daemons.</li>
+  <li>A Resource manager (possibly together with a scheduler)</li>
+  <li>Various HOD components</li>
+  <li>Hadoop Map/Reduce and HDFS daemons</li>
 </ul>
 
 <p>
 HOD provisions and maintains Hadoop Map/Reduce and, optionally, HDFS instances 
 through interaction with the above components on a given cluster of nodes. A cluster of
-nodes can be thought of as comprising of two sets of nodes:</p>
+nodes can be thought of as comprising two sets of nodes:</p>
 <ul>
  <li>Submit nodes: Users use the HOD client on these nodes to allocate clusters, and then
 use the Hadoop client to submit Hadoop jobs. </li>
@@ -54,18 +55,18 @@
 </p>
 
 <ul>
-  <li>The user uses the HOD client on the Submit node to allocate a required number of
-cluster nodes, and provision Hadoop on them.</li>
-  <li>The HOD client uses a Resource Manager interface, (qsub, in Torque), to submit a HOD
-process, called the RingMaster, as a Resource Manager job, requesting the user desired number 
-of nodes. This job is submitted to the central server of the Resource Manager (pbs_server, in Torque).</li>
-  <li>On the compute nodes, the resource manager slave daemons, (pbs_moms in Torque), accept
-and run jobs that they are given by the central server (pbs_server in Torque). The RingMaster 
+  <li>The user uses the HOD client on the Submit node to allocate a desired number of
+cluster nodes and to provision Hadoop on them.</li>
+  <li>The HOD client uses a resource manager interface (qsub, in Torque) to submit a HOD
+process, called the RingMaster, as a resource manager job, to request the user's desired number 
+of nodes. This job is submitted to the central server of the resource manager (pbs_server, in Torque).</li>
+  <li>On the compute nodes, the resource manager slave daemons (pbs_moms in Torque) accept
+and run jobs that they are assigned by the central server (pbs_server in Torque). The RingMaster 
 process is started on one of the compute nodes (mother superior, in Torque).</li>
-  <li>The Ringmaster then uses another Resource Manager interface, (pbsdsh, in Torque), to run
+  <li>The RingMaster then uses another resource manager interface (pbsdsh, in Torque) to run
 the second HOD component, HodRing, as distributed tasks on each of the compute
 nodes allocated.</li>
-  <li>The Hodrings, after initializing, communicate with the Ringmaster to get Hadoop commands, 
+  <li>The HodRings, after initializing, communicate with the RingMaster to get Hadoop commands, 
 and run them accordingly. Once the Hadoop commands are started, they register with the RingMaster,
 giving information about the daemons.</li>
  <li>All the configuration files needed for Hadoop instances are generated by HOD itself, 
@@ -74,24 +75,25 @@
 JobTracker and HDFS daemons.</li>
 </ul>
 
-<p>The rest of the document deals with the steps needed to setup HOD on a physical cluster of nodes.</p>
+<p>The rest of this document describes how to set up HOD on a physical cluster of nodes.</p>
 
 </section>
 
 <section>
 <title>Pre-requisites</title>
-
+<p>To use HOD, your system should include the following hardware and software
+components.</p>
 <p>Operating System: HOD is currently tested on RHEL4.<br/>
-Nodes : HOD requires a minimum of 3 nodes configured through a resource manager.<br/></p>
+Nodes: HOD requires a minimum of three nodes configured through a resource manager.<br/></p>
 
 <p> Software </p>
-<p>The following components are to be installed on *ALL* the nodes before using HOD:</p>
+<p>The following components must be installed on ALL nodes before using HOD:</p>
 <ul>
  <li>Torque: Resource manager</li>
 <li><a href="ext:hod/python">Python</a>: HOD requires version 2.5.1 of Python.</li>
 </ul>
 
-<p>The following components can be optionally installed for getting better
+<p>The following components are optional and can be installed to obtain better
 functionality from HOD:</p>
 <ul>
  <li><a href="ext:hod/twisted-python">Twisted Python</a>: This can be
@@ -129,27 +131,27 @@
   href="ext:hod/torque-mailing-list">here</a>.
 </p>
 
-<p>For using HOD with Torque:</p>
+<p>To use HOD with Torque:</p>
 <ul>
- <li>Install Torque components: pbs_server on one node(head node), pbs_mom on all
+ <li>Install Torque components: pbs_server on one node (head node), pbs_mom on all
   compute nodes, and PBS client tools on all compute nodes and submit
-  nodes. Perform atleast a basic configuration so that the Torque system is up and
-  running i.e pbs_server knows which machines to talk to. Look <a
+  nodes. Perform at least a basic configuration so that the Torque system is up and
+  running, that is, pbs_server knows which machines to talk to. Look <a
   href="ext:hod/torque-basic-config">here</a>
   for basic configuration.
 
   For advanced configuration, see <a
   href="ext:hod/torque-advanced-config">here</a></li>
 <li>Create a queue for submitting jobs on the pbs_server. The name of the queue is the
-  same as the HOD configuration parameter, resource-manager.queue. The Hod client uses this queue to
-  submit the Ringmaster process as a Torque job.</li>
- <li>Specify a 'cluster name' as a 'property' for all nodes in the cluster.
-  This can be done by using the 'qmgr' command. For example:
-  qmgr -c "set node node properties=cluster-name". The name of the cluster is the same as
+  same as the HOD configuration parameter, resource-manager.queue. The HOD client uses this queue to
+  submit the RingMaster process as a Torque job.</li>
+ <li>Specify a cluster name as a property for all nodes in the cluster.
+  This can be done by using the qmgr command. For example:
+  <code>qmgr -c "set node node properties=cluster-name"</code>. The name of the cluster is the same as
   the HOD configuration parameter, hod.cluster. </li>
- <li>Ensure that jobs can be submitted to the nodes. This can be done by
-  using the 'qsub' command. For example:
-  echo "sleep 30" | qsub -l nodes=3</li>
+ <li>Make sure that jobs can be submitted to the nodes. This can be done by
+  using the qsub command. For example:
+  <code>echo "sleep 30" | qsub -l nodes=3</code></li>
 </ul>
 
 </section>
@@ -157,14 +159,14 @@
 <section>
 <title>Installing HOD</title>
 
-<p>Now that the resource manager set up is done, we proceed on to obtaining and
-installing HOD.</p>
+<p>Once the resource manager is set up, you can obtain and
+install HOD.</p>
 <ul>
- <li>If you are getting HOD from the Hadoop tarball,it is available under the 
+ <li>If you are getting HOD from the Hadoop tarball, it is available under the 
  'contrib' section of Hadoop, under the root directory 'hod'.</li>
  <li>If you are building from source, you can run ant tar from the Hadoop root
-  directory, to generate the Hadoop tarball, and then pick HOD from there,
-  as described in the point above.</li>
+  directory to generate the Hadoop tarball, and then get HOD from there,
+  as described above.</li>
  <li>Distribute the files under this directory to all the nodes in the
   cluster. Note that the location where the files are copied should be
   the same on all the nodes.</li>
@@ -176,14 +178,17 @@
 <section>
 <title>Configuring HOD</title>
 
-<p>After HOD installation is done, it has to be configured before we start using
-it.</p>
+<p>You can configure HOD once it is installed. The minimal configuration needed
+to run HOD is described below. More advanced configuration options are discussed
+in the HOD Configuration Guide.</p>
 <section>
-  <title>Minimal Configuration to get started</title>
+  <title>Minimal Configuration</title>
+  <p>To get started using HOD, the following minimal configuration is
+  required:</p>
 <ul>
- <li>On the node from where you want to run hod, edit the file hodrc
-  which can be found in the &lt;install dir&gt;/conf directory. This file
-  contains the minimal set of values required for running hod.</li>
+ <li>On the node from where you want to run HOD, edit the file hodrc
+  located in the &lt;install dir&gt;/conf directory. This file
+  contains the minimal set of values required to run HOD.</li>
  <li>
 <p>Specify values suitable to your environment for the following
   variables defined in the configuration file. Note that some of these
@@ -196,7 +201,7 @@
     'node property' as mentioned in resource manager configuration.</li>
    <li>${HADOOP_HOME}: Location of Hadoop installation on the compute and
     submit nodes.</li>
-   <li>${RM_QUEUE}: Queue configured for submiting jobs in the resource
+   <li>${RM_QUEUE}: Queue configured for submitting jobs in the resource
     manager configuration.</li>
    <li>${RM_HOME}: Location of the resource manager installation on the
     compute and submit nodes.</li>
@@ -204,15 +209,15 @@
 </li>
 
 <li>
-<p>The following environment variables *may* need to be set depending on
+<p>The following environment variables may need to be set depending on
   your environment. These variables must be defined where you run the
-  HOD client, and also be specified in the HOD configuration file as the
+  HOD client and must also be specified in the HOD configuration file as the
   value of the key resource_manager.env-vars. Multiple variables can be
   specified as a comma separated list of key=value pairs.</p>
 
   <ul>
    <li>HOD_PYTHON_HOME: If you install python to a non-default location
-    of the compute nodes, or submit nodes, then, this variable must be
+    on the compute nodes or submit nodes, then this variable must be
     defined to point to the python executable in the non-standard
     location.</li>
     </ul>
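The comma-separated key=value format used by resource_manager.env-vars can be sketched with a small parser. This is illustrative only (not HOD's actual code), and the sample paths are made up:

```python
# Illustrative sketch: split a "K1=V1,K2=V2" style value, such as the one
# given to resource_manager.env-vars, into individual variables.
def parse_env_vars(spec):
    """Turn "K1=V1,K2=V2" into a dict. Assumes values contain no commas."""
    env = {}
    for pair in spec.split(","):
        if not pair.strip():
            continue
        key, _, value = pair.partition("=")
        env[key.strip()] = value.strip()
    return env

# Example: a non-default python location plus a second, hypothetical variable.
print(parse_env_vars("HOD_PYTHON_HOME=/opt/python2.5/bin/python,FOO=bar"))
```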
@@ -222,38 +227,38 @@
 
   <section>
     <title>Advanced Configuration</title>
-    <p> You can review other configuration options in the file and modify them to suit
- your needs. Refer to the <a href="hod_config_guide.html">Configuration Guide</a>
for information about the HOD
- configuration.
-    </p>
+    <p> You can review and modify other configuration options to suit
+ your specific needs. Refer to the <a href="hod_config_guide.html">Configuration
+ Guide</a> for more information.</p>
   </section>
 </section>
 
   <section>
     <title>Running HOD</title>
-    <p>You can now proceed to <a href="hod_user_guide.html">HOD User Guide</a> for information about how to run HOD,
-    what are the various features, options and for help in trouble-shooting.</p>
+    <p>You can run HOD once it is configured. Refer to <a
+    href="hod_user_guide.html">the HOD User Guide</a> for more information.</p>
   </section>
 
   <section>
     <title>Supporting Tools and Utilities</title>
-    <p>This section describes certain supporting tools and utilities that can be used in managing HOD deployments.</p>
+    <p>This section describes supporting tools and utilities that can be used to
+    manage HOD deployments.</p>
     
     <section>
-      <title>logcondense.py - Tool for removing log files uploaded to DFS</title>
-      <p>As mentioned in 
-         <a href="hod_user_guide.html#Collecting+and+Viewing+Hadoop+Logs">this section</a> of the
-         <a href="hod_user_guide.html">HOD User Guide</a>, HOD can be configured to upload
+      <title>logcondense.py - Manage Log Files</title>
+      <p>As mentioned in the 
+         <a href="hod_user_guide.html#Collecting+and+Viewing+Hadoop+Logs">HOD User Guide</a>,
+         HOD can be configured to upload
         Hadoop logs to a statically configured HDFS. Over time, the number of logs uploaded
-         to DFS could increase. logcondense.py is a tool that helps administrators to clean-up
-         the log files older than a certain number of days. </p>
+         to HDFS could increase. logcondense.py is a tool that helps
+         administrators to remove log files uploaded to HDFS. </p>
      <section>
        <title>Running logcondense.py</title>
        <p>logcondense.py is available under the hod_install_location/support folder. You can either
-        run it using python, for e.g. <em>python logcondense.py</em>, or give execute permissions 
+        run it using python, for example, <em>python logcondense.py</em>, or give execute permissions 
        to the file, and directly run it as <em>logcondense.py</em>. logcondense.py needs to be 
        run by a user who has sufficient permissions to remove files from locations where log 
-        files are uploaded in the DFS, if permissions are enabled. For e.g. as mentioned in the
+        files are uploaded in HDFS, if permissions are enabled. For example, as mentioned in the
        <a href="hod_config_guide.html#3.7+hodring+options">configuration guide</a>, the logs could
        be configured to come under the user's home directory in HDFS. In that case, the user
        running logcondense.py should have super user privileges to remove the files from under
@@ -302,8 +307,9 @@
               <td>--dynamicdfs</td>
              <td>If true, this will indicate that the logcondense.py script should delete HDFS logs
              in addition to Map/Reduce logs. Otherwise, it only deletes Map/Reduce logs, which is also the
-              default if this option is not specified. This option is useful if dynamic DFS installations 
-              are being provisioned by HOD, and the static DFS installation is being used only to collect 
+              default if this option is not specified. This option is useful if
+              dynamic HDFS installations 
+              are being provisioned by HOD, and the static HDFS installation is being used only to collect 
               logs - a scenario that may be common in test clusters.</td>
               <td>false</td>
             </tr>
@@ -314,14 +320,15 @@
       </section>
     </section>
     <section>
-      <title>checklimits.sh - Tool to update torque comment field reflecting resource limits</title>
-      <p>checklimits is a HOD tool specific to Torque/Maui environment
+      <title>checklimits.sh - Monitor Resource Limits</title>
+      <p>checklimits.sh is a HOD tool specific to the Torque/Maui environment
       (<a href="ext:hod/maui">Maui Cluster Scheduler</a> is an open source job
       scheduler for clusters and supercomputers, from clusterresources). The
       checklimits.sh script
-      updates torque comment field when newly submitted job(s) violate/cross
+      updates the torque comment field when newly submitted job(s) violate or
+      exceed
      user limits set up in the Maui scheduler. It uses qstat, does one pass
-      over torque job list to find out queued or unfinished jobs, runs Maui
+      over the torque job-list to determine queued or unfinished jobs, runs Maui
       tool checkjob on each job to see if user limits are violated and then
       runs torque's qalter utility to update job attribute 'comment'. Currently
       it updates the comment as <em>User-limits exceeded. Requested:([0-9]*)
@@ -330,16 +337,16 @@
       the type of violation.</p>
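The comment pattern quoted above can be checked against a sample comment with a short Python sketch. The comment string below is invented for illustration; only the pattern itself comes from the document:

```python
# Match the torque comment format that checklimits.sh writes, as quoted above.
import re

PATTERN = re.compile(
    r"User-limits exceeded\. Requested:([0-9]*) Used:([0-9]*) MaxLimit:([0-9]*)"
)

# A hypothetical comment value a job might carry after checklimits.sh ran.
comment = "User-limits exceeded. Requested:4 Used:10 MaxLimit:12"
match = PATTERN.search(comment)
print(match.groups())  # -> ('4', '10', '12')
```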
       <section>
         <title>Running checklimits.sh</title>
-        <p>checklimits.sh is available under hod_install_location/support
-        folder. This is a shell script and can be run directly as <em>sh
+        <p>checklimits.sh is available under the hod_install_location/support
+        folder. This shell script can be run directly as <em>sh
         checklimits.sh </em>or as <em>./checklimits.sh</em> after enabling
         execute permissions. Torque and Maui binaries should be available
         on the machine where the tool is run and should be in the path
-        of the shell script process. In order for this tool to be able to update
-        comment field of jobs from different users, it has to be run with
-        torque administrative privileges. This tool has to be run repeatedly
+        of the shell script process. To update the
+        comment field of jobs from different users, this tool must be run with
+        torque administrative privileges. This tool must be run repeatedly
         after specific intervals of time to frequently update jobs violating
-        constraints, for e.g. via cron. Please note that the resource manager
+        constraints, for example, via cron. Please note that the resource manager
         and scheduler commands used in this script can be expensive and so
         it is better not to run this inside a tight loop without sleeping.</p>
       </section>

Modified: hadoop/core/trunk/src/docs/src/documentation/content/xdocs/hod_config_guide.xml
URL: http://svn.apache.org/viewvc/hadoop/core/trunk/src/docs/src/documentation/content/xdocs/hod_config_guide.xml?rev=673334&r1=673333&r2=673334&view=diff
==============================================================================
--- hadoop/core/trunk/src/docs/src/documentation/content/xdocs/hod_config_guide.xml (original)
+++ hadoop/core/trunk/src/docs/src/documentation/content/xdocs/hod_config_guide.xml Wed Jul  2 02:47:25 2008
@@ -16,26 +16,26 @@
     <section>
       <title>1. Introduction</title>
     
-      <p>Configuration options for HOD are organized as sections and options 
-      within them. They can be specified in two ways: a configuration file 
+      <p>This document explains some of the most important and commonly used 
+      Hadoop On Demand (HOD) configuration options. Configuration options 
+      can be specified in two ways: a configuration file 
       in the INI format, and as command line options to the HOD shell, 
       specified in the format --section.option[=value]. If the same option is 
       specified in both places, the value specified on the command line 
       overrides the value in the configuration file.</p>
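The override rule just described can be sketched in Python. This is not HOD's actual implementation; the section and option names below are illustrative:

```python
# Sketch (assumed behavior, not HOD's code): values from an INI file,
# overridden by "--section.option=value" command-line flags.
import configparser
import io

INI_TEXT = """
[hod]
java-home = /usr/lib/jvm/default
debug = 3
"""

def apply_overrides(config, args):
    """Apply --section.option=value style flags onto a parsed config."""
    for arg in args:
        if not arg.startswith("--"):
            continue
        key, _, value = arg[2:].partition("=")
        section, _, option = key.partition(".")
        if section and option:
            if not config.has_section(section):
                config.add_section(section)
            config.set(section, option, value)
    return config

config = configparser.ConfigParser()
config.read_file(io.StringIO(INI_TEXT))
apply_overrides(config, ["--hod.debug=4"])
print(config.get("hod", "debug"))  # the command line wins over the file
```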
       
       <p>
-        To get a simple description of all configuration options, you can type
+        To get a simple description of all configuration options, type:
       </p>
       <table><tr><td><code>$ hod --verbose-help</code></td></tr></table>
       
-      <p>This document explains some of the most important or commonly used
-      configuration options in some more detail.</p>
+
     </section>
     
     <section>
       <title>2. Sections</title>
     
-      <p>The following are the various sections in the HOD configuration:</p>
+      <p>HOD organizes configuration options into these sections:</p>
       
       <ul>
         <li>  hod:                  Options for the HOD client</li>
@@ -43,19 +43,19 @@
          to use, and other parameters for using that resource manager</li>
         <li>  ringmaster:           Options for the RingMaster process, </li>
         <li>  hodring:              Options for the HodRing processes</li>
-        <li>  gridservice-mapred:   Options for the MapReduce daemons</li>
+        <li>  gridservice-mapred:   Options for the Map/Reduce daemons</li>
         <li>  gridservice-hdfs:     Options for the HDFS daemons.</li>
       </ul>
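A skeleton configuration file using these sections might look like the following. Only the section names come from the list above; every option value here is a placeholder, and the option names shown are examples discussed elsewhere in this guide, not defaults:

```ini
; Hypothetical hodrc skeleton (values are placeholders, not defaults)
[hod]
java-home = /usr/lib/jvm/default

[resource_manager]
; option names here are assumptions for illustration
queue = hod-queue

[ringmaster]
work-dirs = /grid/1/hod,/grid/2/hod

[hodring]
temp-dir = /tmp/hod

[gridservice-mapred]
external = false

[gridservice-hdfs]
external = false
```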
     
-      
-      <p>The next section deals with some of the important options in the HOD 
-        configuration.</p>
     </section>
     
     <section>
-      <title>3. Important / Commonly Used Configuration Options</title>
-  
+      <title>3. HOD Configuration Options</title>
   
+      <p>The next sections describe configuration options common to most 
+      HOD sections, followed by configuration options 
+      specific to each HOD section.</p>
+      
       <section> 
         <title>3.1 Common configuration options</title>
         
@@ -70,7 +70,7 @@
                       sure that the users who will run hod have rights to create 
                       directories under the directory specified here.</li>
           
-          <li>debug: A numeric value from 1-4. 4 produces the most log information,
+          <li>debug: Numeric value from 1-4. 4 produces the most log information,
                    and 1 the least.</li>
           
           <li>log-dir: Directory where log files are stored. By default, this is
@@ -78,10 +78,10 @@
                      temp-dir variable apply here too.
           </li>
           
-          <li>xrs-port-range: A range of ports, among which an available port shall
+          <li>xrs-port-range: Range of ports, among which an available port shall
                             be picked for use to run an XML-RPC server.</li>
           
-          <li>http-port-range: A range of ports, among which an available port shall
+          <li>http-port-range: Range of ports, among which an available port shall
                              be picked for use to run an HTTP server.</li>
           
           <li>java-home: Location of Java to be used by Hadoop.</li>
@@ -96,15 +96,15 @@
         <title>3.2 hod options</title>
         
         <ul>
-          <li>cluster: A descriptive name given to the cluster. For Torque, this is
+          <li>cluster: Descriptive name given to the cluster. For Torque, this is
                      specified as a 'Node property' for every node in the cluster.
                      HOD uses this value to compute the number of available nodes.</li>
           
-          <li>client-params: A comma-separated list of hadoop config parameters
+          <li>client-params: Comma-separated list of hadoop config parameters
                            specified as key-value pairs. These will be used to
                            generate a hadoop-site.xml on the submit node that 
-                           should be used for running MapReduce jobs.</li>
-          <li>job-feasibility-attr: A regular expression string that specifies
+                           should be used for running Map/Reduce jobs.</li>
+          <li>job-feasibility-attr: Regular expression string that specifies
                            whether and how to check job feasibility - resource
                            manager or scheduler limits. The current
                            implementation corresponds to the torque job
@@ -113,16 +113,16 @@
                            of limit violation is triggered and either
                            deallocates the cluster or stays in queued state
                            according as the request is beyond maximum limits or
-                           the cumulative usage has crossed maxumum limits. 
+                           the cumulative usage has crossed maximum limits. 
                            The torque comment attribute may be updated
-                           periodically by an external mechanism. For e.g.,
+                           periodically by an external mechanism. For example,
                            comment attribute can be updated by running <a href=
 "hod_admin_guide.html#checklimits.sh+-+Tool+to+update+torque+comment+field+reflecting+resource+limits">
                            checklimits.sh</a> script in hod/support directory,
                            and then setting job-feasibility-attr equal to the
-                           value TORQUE_USER_LIMITS_COMMENT_FIELD i.e
+                           value TORQUE_USER_LIMITS_COMMENT_FIELD,
                            "User-limits exceeded. Requested:([0-9]*)
-                           Used:([0-9]*) MaxLimit:([0-9]*)" will make HOD
+                           Used:([0-9]*) MaxLimit:([0-9]*)", will make HOD
                            behave accordingly.
                            </li>
          </ul>
@@ -139,7 +139,7 @@
                         which the executables of the resource manager can be 
                         found.</li> 
           
-          <li>env-vars: This is a comma separated list of key-value pairs, 
+          <li>env-vars: Comma-separated list of key-value pairs, 
                       expressed as key=value, which would be passed to the jobs 
                       launched on the compute nodes. 
                       For example, if the python installation is 
@@ -154,18 +154,18 @@
         <title>3.4 ringmaster options</title>
         
         <ul>
-          <li>work-dirs: These are a list of comma separated paths that will serve
+          <li>work-dirs: Comma-separated list of paths that will serve
                        as the root for directories that HOD generates and passes
-                       to Hadoop for use to store DFS / MapReduce data. For e.g.
+                       to Hadoop for use to store DFS and Map/Reduce data. For example,
                        this is where DFS data blocks will be stored. Typically,
                        as many paths are specified as there are disks available
                        to ensure all disks are being utilized. The restrictions
                        and notes for the temp-dir variable apply here too.</li>
-          <li>max-master-failures: It defines how many times a hadoop master
+          <li>max-master-failures: Number of times a hadoop master
                        daemon can fail to launch, beyond which HOD will fail
                        the cluster allocation altogether. In HOD clusters,
                        sometimes there might be a single or few "bad" nodes due
-                       to issues like missing java, missing/incorrect version
+                       to issues like missing java, missing or incorrect version
                        of Hadoop etc. When this configuration variable is set
                        to a positive integer, the RingMaster returns an error
                        to the client only when the number of times a hadoop
@@ -184,7 +184,7 @@
         <title>3.5 gridservice-hdfs options</title>
         
         <ul>
-          <li>external: If false, this indicates that a HDFS cluster must be 
+          <li>external: If false, indicates that an HDFS cluster must be 
                      brought up by the HOD system, on the nodes which it 
                       allocates via the allocate command. Note that in that case,
                       when the cluster is de-allocated, it will bring down the 
@@ -207,7 +207,7 @@
                   located. This can be used to use a pre-installed version of
                   Hadoop on the cluster.</li>
           
-          <li>server-params: A comma-separated list of hadoop config parameters
+          <li>server-params: Comma-separated list of hadoop config parameters
                             specified as key-value pairs. These will be used to
                            generate a hadoop-site.xml that will be used by the
                            NameNode and DataNodes.</li>
@@ -220,11 +220,11 @@
         <title>3.6 gridservice-mapred options</title>
         
         <ul>
-          <li>external: If false, this indicates that a MapReduce cluster must be
+          <li>external: If false, indicates that a Map/Reduce cluster must be
                      brought up by the HOD system on the nodes which it allocates
                      via the allocate command.
                      If true, it will try and connect to an externally 
-                      configured MapReduce system.</li>
+                      configured Map/Reduce system.</li>
           
           <li>host: Hostname of the externally configured JobTracker, if any</li>
           
@@ -235,7 +235,7 @@
           <li>pkgs: Installation directory, under which bin/hadoop executable is 
                   located</li>
           
-          <li>server-params: A comma-separated list of hadoop config parameters
+          <li>server-params: Comma-separated list of hadoop config parameters
                             specified as key-value pairs. These will be used to
                            generate a hadoop-site.xml that will be used by the
                            JobTracker and TaskTrackers</li>
@@ -266,8 +266,8 @@
                                    cluster node's local file path, use the format 'file://path'.
 
                                   When clusters are deallocated by HOD, the hadoop logs will
-                                   be deleted as part of HOD's cleanup process. In order to
-                                   persist these logs, you can use this configuration option.
+                                   be deleted as part of HOD's cleanup process. To ensure these
+                                   logs persist, you can use this configuration option.
 
                                    The format of the path is 
                                    value-of-this-option/userid/hod-logs/cluster-id
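The path layout above can be sketched in Python. The userid and cluster id below are made-up sample values:

```python
# Compose the uploaded-log directory in the layout described above:
#   value-of-this-option/userid/hod-logs/cluster-id
import posixpath

def hod_log_dir(log_destination_uri, userid, cluster_id):
    """Illustrative helper; the name and signature are assumptions."""
    return posixpath.join(log_destination_uri, userid, "hod-logs", str(cluster_id))

# Sample values: the URI matches the example earlier in this section;
# "hoduser" and 123456 are hypothetical.
print(hod_log_dir("hdfs://host123:45678/user/hod/logs", "hoduser", 123456))
```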

Modified: hadoop/core/trunk/src/docs/src/documentation/content/xdocs/hod_user_guide.xml
URL: http://svn.apache.org/viewvc/hadoop/core/trunk/src/docs/src/documentation/content/xdocs/hod_user_guide.xml?rev=673334&r1=673333&r2=673334&view=diff
==============================================================================
--- hadoop/core/trunk/src/docs/src/documentation/content/xdocs/hod_user_guide.xml (original)
+++ hadoop/core/trunk/src/docs/src/documentation/content/xdocs/hod_user_guide.xml Wed Jul  2 02:47:25 2008
@@ -14,7 +14,7 @@
     <title> Introduction </title><anchor id="Introduction"></anchor>
  <p>Hadoop On Demand (HOD) is a system for provisioning virtual Hadoop clusters over a large physical cluster. It uses the Torque resource manager to do node allocation. On the allocated nodes, it can start Hadoop Map/Reduce and HDFS daemons. It automatically generates the appropriate configuration files (hadoop-site.xml) for the Hadoop daemons and client. HOD also has the capability to distribute Hadoop to the nodes in the virtual cluster that it allocates. In short, HOD makes it easy for administrators and users to quickly set up and use Hadoop. It is also a very useful tool for Hadoop developers and testers who need to share a physical cluster for testing their own Hadoop versions.</p>
   <p>HOD supports Hadoop from version 0.15 onwards.</p>
-  <p>The rest of the documentation comprises of a quick-start guide that helps you get quickly started with using HOD, a more detailed guide of all HOD features, command line options, known issues and trouble-shooting information.</p>
+  <p>The rest of this document comprises a quick-start guide that helps you get started with HOD quickly, a more detailed guide to all HOD features, and a trouble-shooting section.</p>
   </section>
   <section>
 		<title> Getting Started Using HOD </title><anchor id="Getting_Started_Using_HOD_0_4"></anchor>
@@ -110,7 +110,7 @@
  <section><title> Provisioning and Managing Hadoop Clusters </title><anchor id="Provisioning_and_Managing_Hadoop"></anchor>
  <p>The primary feature of HOD is to provision Hadoop Map/Reduce and HDFS clusters. This is described above in the Getting Started section. Also, as long as nodes are available, and organizational policies allow, a user can use HOD to allocate multiple Map/Reduce clusters simultaneously. The user would need to specify different paths for the <code>cluster_dir</code> parameter mentioned above for each cluster he/she allocates. HOD provides the <em>list</em> and the <em>info</em> operations to enable managing multiple clusters.</p>
  <p><strong> Operation <em>list</em></strong></p><anchor id="Operation_list"></anchor>
-  <p>The list operation lists all the clusters allocated so far by a user. The cluster directory where the hadoop-site.xml is stored for the cluster, and it's status vis-a-vis connectivity with the JobTracker and/or HDFS is shown. The list operation has the following syntax:</p>
+  <p>The list operation lists all the clusters allocated so far by a user. The cluster directory where the hadoop-site.xml is stored for the cluster, and its status vis-a-vis connectivity with the JobTracker and/or HDFS is shown. The list operation has the following syntax:</p>
     <table>
       
         <tr>
@@ -219,7 +219,7 @@
   <table><tr><td><code>log-destination-uri = hdfs://host123:45678/user/hod/logs</code> or</td></tr>
     <tr><td><code>log-destination-uri = file://path/to/store/log/files</code></td></tr>
     </table>
-  <p>Under the root directory specified above in the path, HOD will create a create a path user_name/torque_jobid and store gzipped log files for each node that was part of the job.</p>
+  <p>Under the root directory specified above in the path, HOD will create a path user_name/torque_jobid and store gzipped log files for each node that was part of the job.</p>
  <p>Note that to store the files to HDFS, you may need to configure the <code>hodring.pkgs</code> option with the Hadoop version that matches the HDFS mentioned. If not, HOD will try to use the Hadoop version that it is using to provision the Hadoop cluster itself.</p>
   </section>
  <section><title> Auto-deallocation of Idle Clusters </title><anchor id="Auto_deallocation_of_Idle_Cluste"></anchor>
@@ -242,7 +242,7 @@
           <td><code>$ hod allocate -d cluster_dir -n number_of_nodes -N name_of_job</code></td>
         </tr>
     </table>
-  <p><em>Note:</em> Due to restriction in the underlying Torque resource manager, names which do not start with a alphabet or contain a 'space' will cause the job to fail. The failure message points to the problem being in the specified job name.</p>
+  <p><em>Note:</em> Due to a restriction in the underlying Torque resource manager, names which do not start with an alphabetic character or contain a 'space' will cause the job to fail. The failure message points to the problem being in the specified job name.</p>
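The naming restriction just noted can be illustrated with a rough pre-check. The authoritative rules are Torque's own; this regex is an assumption for illustration, as are the sample names:

```python
# Rough validity check for the Torque job-name restriction described above:
# must start with an alphabetic character and contain no spaces.
import re

def is_valid_torque_job_name(name):
    """Illustrative only; Torque itself is the authority on valid names."""
    return bool(re.match(r"^[A-Za-z][^\s]*$", name))

print(is_valid_torque_job_name("MyHadoopJob"))  # True
print(is_valid_torque_job_name("123job"))       # False: starts with a digit
print(is_valid_torque_job_name("my job"))       # False: contains a space
```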
   </section>
  <section><title> Capturing HOD exit codes in Torque </title><anchor id="Capturing_HOD_exit_codes_in_Torq"></anchor>
  <p>HOD exit codes are captured in the Torque exit_status field. This helps users and system administrators distinguish successful runs of HOD from unsuccessful ones. The exit codes are 0 if allocation succeeded and all hadoop jobs ran on the allocated cluster correctly. They are non-zero if allocation failed or some of the hadoop jobs failed on the allocated cluster. The possible exit codes are listed in the table below. <em>Note: Hadoop job status is captured only if the version of Hadoop used is 16 or above.</em></p>
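The zero/non-zero convention above can be demonstrated generically with a child process. This is not HOD itself; it only shows how an exit status like the one Torque records in exit_status is read, with an arbitrary failure code of 7:

```python
# Generic demonstration of reading a child process's exit status, analogous
# to how Torque's exit_status field reflects HOD's exit code.
import subprocess
import sys

ok = subprocess.run([sys.executable, "-c", "raise SystemExit(0)"])
failed = subprocess.run([sys.executable, "-c", "raise SystemExit(7)"])
print(ok.returncode, failed.returncode)  # prints: 0 7
```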


