hadoop-mapreduce-commits mailing list archives

From: omal...@apache.org
Subject: svn commit: r816439 [1/3] - in /hadoop/mapreduce/trunk: ./ src/docs/cn/ src/docs/src/documentation/ src/docs/src/documentation/content/xdocs/ src/docs/src/documentation/resources/images/
Date: Fri, 18 Sep 2009 02:20:49 GMT
Author: omalley
Date: Fri Sep 18 02:20:48 2009
New Revision: 816439

URL: http://svn.apache.org/viewvc?rev=816439&view=rev
Log:
MAPREDUCE-916. Split the documentation to match the project split.
(Corinne Chandel via omalley)

Added:
    hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/hod_scheduler.xml
    hadoop/mapreduce/trunk/src/docs/src/documentation/resources/images/mapreduce-logo.jpg   (with props)
Removed:
    hadoop/mapreduce/trunk/src/docs/cn/
    hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/SLG_user_guide.xml
    hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/hdfs_design.xml
    hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/hdfs_imageviewer.xml
    hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/hdfs_permissions_guide.xml
    hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/hdfs_quota_admin_guide.xml
    hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/hdfs_shell.xml
    hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/hdfs_user_guide.xml
    hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/hod_admin_guide.xml
    hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/hod_config_guide.xml
    hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/hod_user_guide.xml
    hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/libhdfs.xml
    hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/native_libraries.xml
    hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/quickstart.xml
    hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/service_level_auth.xml
Modified:
    hadoop/mapreduce/trunk/CHANGES.txt
    hadoop/mapreduce/trunk/build.xml
    hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/capacity_scheduler.xml
    hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/commands_manual.xml
    hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/distcp.xml
    hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/fair_scheduler.xml
    hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/hadoop_archives.xml
    hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/index.xml
    hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/mapred_tutorial.xml
    hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/site.xml
    hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/streaming.xml
    hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/tabs.xml
    hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/vaidya.xml
    hadoop/mapreduce/trunk/src/docs/src/documentation/skinconf.xml

Modified: hadoop/mapreduce/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/hadoop/mapreduce/trunk/CHANGES.txt?rev=816439&r1=816438&r2=816439&view=diff
==============================================================================
--- hadoop/mapreduce/trunk/CHANGES.txt (original)
+++ hadoop/mapreduce/trunk/CHANGES.txt Fri Sep 18 02:20:48 2009
@@ -358,6 +358,9 @@
     MAPREDUCE-907. Sqoop should use more intelligent splits. (Aaron Kimball
     via tomwhite)
 
+    MAPREDUCE-916. Split the documentation to match the project split.
+    (Corinne Chandel via omalley)
+
   BUG FIXES
 
     MAPREDUCE-878. Rename fair scheduler design doc to 

Modified: hadoop/mapreduce/trunk/build.xml
URL: http://svn.apache.org/viewvc/hadoop/mapreduce/trunk/build.xml?rev=816439&r1=816438&r2=816439&view=diff
==============================================================================
--- hadoop/mapreduce/trunk/build.xml (original)
+++ hadoop/mapreduce/trunk/build.xml Fri Sep 18 02:20:48 2009
@@ -44,7 +44,6 @@
   <property name="conf.dir" value="${basedir}/conf"/>
   <property name="contrib.dir" value="${basedir}/src/contrib"/>
   <property name="docs.src" value="${basedir}/src/docs"/>
-  <property name="src.docs.cn" value="${basedir}/src/docs/cn"/>
   <property name="changes.src" value="${docs.src}/changes"/>
   <property name="c++.src" value="${basedir}/src/c++"/>
   <property name="c++.utils.src" value="${c++.src}/utils"/>
@@ -79,7 +78,6 @@
   <property name="build.c++.examples.pipes" 
             value="${build.c++}/examples/pipes"/>
   <property name="build.docs" value="${build.dir}/docs"/>
-  <property name="build.docs.cn" value="${build.dir}/docs/cn"/>
   <property name="build.javadoc" value="${build.docs}/api"/>
   <property name="build.javadoc.timestamp" value="${build.javadoc}/index.html" />
   <property name="build.javadoc.dev" value="${build.docs}/dev-api"/>
@@ -750,22 +748,6 @@
     <style basedir="${mapred.src.dir}" destdir="${build.docs}"
            includes="mapred-default.xml" style="conf/configuration.xsl"/>
     <antcall target="changes-to-html"/>
-    <antcall target="cn-docs"/>
-  </target>
-
-  <target name="cn-docs" depends="forrest.check, init" 
-       description="Generate forrest-based Chinese documentation. To use, specify -Dforrest.home=&lt;base of Apache Forrest installation&gt; on the command line." 
-        if="forrest.home">
-    <exec dir="${src.docs.cn}" executable="${forrest.home}/bin/forrest" failonerror="true">
-      <env key="LANG" value="en_US.utf8"/>
-      <env key="JAVA_HOME" value="${java5.home}"/>
-    </exec>
-    <copy todir="${build.docs.cn}">
-      <fileset dir="${src.docs.cn}/build/site/" />
-    </copy>
-    <style basedir="${mapred.src.dir}" destdir="${build.docs.cn}"
-           includes="mapred-default.xml" style="conf/configuration.xsl"/>
-    <antcall target="changes-to-html"/>
   </target>
 
   <target name="forrest.check" unless="forrest.home" depends="java5.check">
@@ -1133,7 +1115,6 @@
   <target name="clean" depends="clean-contrib" description="Clean.  Delete the build files, and their directories">
     <delete dir="${build.dir}"/>
     <delete dir="${docs.src}/build"/>
-    <delete dir="${src.docs.cn}/build"/>
   </target>
 
   <!-- ================================================================== -->

Modified: hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/capacity_scheduler.xml
URL: http://svn.apache.org/viewvc/hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/capacity_scheduler.xml?rev=816439&r1=816438&r2=816439&view=diff
==============================================================================
--- hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/capacity_scheduler.xml (original)
+++ hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/capacity_scheduler.xml Fri Sep 18 02:20:48 2009
@@ -21,7 +21,7 @@
 <document>
   
   <header>
-    <title>Capacity Scheduler Guide</title>
+    <title>Capacity Scheduler</title>
   </header>
   
   <body>
@@ -30,7 +30,7 @@
       <title>Purpose</title>
       
       <p>This document describes the Capacity Scheduler, a pluggable 
-      Map/Reduce scheduler for Hadoop which provides a way to share 
+      MapReduce scheduler for Hadoop which provides a way to share 
       large clusters.</p>
     </section>
     
@@ -40,7 +40,7 @@
       <p>The Capacity Scheduler supports the following features:</p> 
       <ul>
         <li>
-          Support for multiple queues, where a job is submitted to a queue.
+          Multiple queues, where a job is submitted to a queue.
         </li>
         <li>
           Queues are allocated a fraction of the capacity of the grid in the 
@@ -81,7 +81,7 @@
     </section>
     
     <section>
-      <title>Picking a task to run</title>
+      <title>Picking a Task to Run</title>
       
       <p>Note that many of these steps can be, and will be, enhanced over time
       to provide better algorithms.</p>
@@ -131,8 +131,8 @@
           the following property in the site configuration:</p>
           <table>
             <tr>
-              <td>Property</td>
-              <td>Value</td>
+              <th>Name</th>
+              <th>Value</th>
             </tr>
             <tr>
               <td>mapred.jobtracker.taskScheduler</td>
@@ -142,7 +142,7 @@
       </section>
 
       <section>
-        <title>Setting up queues</title>
+        <title>Setting Up Queues</title>
         <p>
           You can define multiple queues to which users can submit jobs with
           the Capacity Scheduler. To define multiple queues, you should edit
@@ -154,14 +154,13 @@
           have access to the queues.
         </p>
         <p>
-          For more details, refer to
-          <a href="cluster_setup.html#Configuring+the+Hadoop+Daemons">Cluster 
-          Setup</a> documentation.
+          For more details, see
+          <a href="http://hadoop.apache.org/common/docs/current/cluster_setup.html#Configuring+the+Hadoop+Daemons">Configuring the Hadoop Daemons</a>.
         </p>
       </section>
   
       <section>
-        <title>Configuring properties for queues</title>
+        <title>Configuring Properties for Queues</title>
 
         <p>The Capacity Scheduler can be configured with several properties
         for each queue that control the behavior of the Scheduler. This
@@ -183,16 +182,16 @@
 
         <table>
           <tr><th>Name</th><th>Description</th></tr>
-          <tr><td>mapred.capacity-scheduler.queue.&lt;queue-name&gt;.capacity</td>
+          <tr><td>mapred.capacity-scheduler.queue.&lt;queue-<br/>name&gt;.capacity</td>
           	<td>Percentage of the number of slots in the cluster that are made 
             to be available for jobs in this queue. The sum of capacities 
             for all queues should be less than or equal 100.</td>
           </tr>
-          <tr><td>mapred.capacity-scheduler.queue.&lt;queue-name&gt;.supports-priority</td>
+          <tr><td>mapred.capacity-scheduler.queue.&lt;queue-<br/>name&gt;.supports-priority</td>
           	<td>If true, priorities of jobs will be taken into account in scheduling 
           	decisions.</td>
           </tr>
-          <tr><td>mapred.capacity-scheduler.queue.&lt;queue-name&gt;.minimum-user-limit-percent</td>
+          <tr><td>mapred.capacity-scheduler.queue.&lt;queue-<br/>name&gt;.minimum-user-limit-percent</td>
           	<td>Each queue enforces a limit on the percentage of resources 
           	allocated to a user at any given time, if there is competition 
           	for them. This user limit can vary between a minimum and maximum 
@@ -205,7 +204,7 @@
           	users, no user can use more than 25% of the queue's resources. A 
           	value of 100 implies no user limits are imposed.</td>
           </tr>
-          <tr><td>mapred.capacity-scheduler.queue.&lt;queue-name&gt;.max.map.slots</td>
+          <tr><td>mapred.capacity-scheduler.queue.&lt;queue-<br/>name&gt;.max.map.slots</td>
           	<td>
 		    This value is the maximum max slots that can be used in a
 		    queue at any point of time. So for example assuming above config value
@@ -221,7 +220,7 @@
 		    implementation
                 </td>
           </tr>
-          <tr><td>mapred.capacity-scheduler.queue.&lt;queue-name&gt;.max.reduce.slots</td>
+          <tr><td>mapred.capacity-scheduler.queue.&lt;queue-<br/>name&gt;.max.reduce.slots</td>
           	<td>
 		    This value is the maximum reduce slots that can be used in a
 		    queue at any point of time. So for example assuming above config value
@@ -241,16 +240,15 @@
       </section>
       
       <section>
-        <title>Memory management</title>
+        <title>Memory Management</title>
       
         <p>The Capacity Scheduler supports scheduling of tasks on a
         <code>TaskTracker</code>(TT) based on a job's memory requirements
         and the availability of RAM and Virtual Memory (VMEM) on the TT node.
-        See the <a href="mapred_tutorial.html#Memory+monitoring">Hadoop 
-        Map/Reduce tutorial</a> for details on how the TT monitors
-        memory usage.</p>
-        <p>Currently the memory based scheduling is only supported
-        in Linux platform.</p>
+        See the 
+        <a href="mapred_tutorial.html">MapReduce Tutorial</a> 
+        for details on how the TT monitors memory usage.</p>
+        <p>Currently, memory-based scheduling is only supported on the Linux platform.</p>
         <p>Memory-based scheduling works as follows:</p>
         <ol>
           <li>The absence of any one or more of three config parameters 
@@ -260,8 +258,8 @@
           <code>mapred.task.limit.maxvmem</code>, disables memory-based
           scheduling, just as it disables memory monitoring for a TT. These
           config parameters are described in the 
-          <a href="mapred_tutorial.html#Memory+monitoring">Hadoop Map/Reduce 
-          tutorial</a>. The value of  
+          <a href="mapred_tutorial.html">MapReduce Tutorial</a>. 
+          The value of  
           <code>mapred.tasktracker.vmem.reserved</code> is 
           obtained from the TT via its heartbeat. 
           </li>
@@ -286,7 +284,7 @@
           set, the Scheduler computes the available RAM on the node. Next, 
           the Scheduler figures out the RAM requirements of the job, if any. 
           As with VMEM, users can optionally specify a RAM limit for their job
-          (<code>mapred.task.maxpmem</code>, described in the Map/Reduce 
+          (<code>mapred.task.maxpmem</code>, described in the MapReduce 
           tutorial). The Scheduler also maintains a limit for this value 
           (<code>mapred.capacity-scheduler.task.default-pmem-percentage-in-vmem</code>, 
           described below). All these three values must be set for the 
@@ -303,7 +301,7 @@
 
         <table>
           <tr><th>Name</th><th>Description</th></tr>
-          <tr><td>mapred.capacity-scheduler.task.default-pmem-percentage-in-vmem</td>
+          <tr><td>mapred.capacity-scheduler.task.default-pmem-<br/>percentage-in-vmem</td>
           	<td>A percentage of the default VMEM limit for jobs
           	(<code>mapred.task.default.maxvmem</code>). This is the default 
           	RAM task-limit associated with a task. Unless overridden by a 
@@ -323,14 +321,14 @@
         scheduled, for reducing the memory footprint on jobtracker. 
         Following are the parameters, by which you can control the laziness
         of the job initialization. The following parameters can be 
-        configured in capacity-scheduler.xml
+        configured in capacity-scheduler.xml:
         </p>
         
         <table>
           <tr><th>Name</th><th>Description</th></tr>
           <tr>
             <td>
-              mapred.capacity-scheduler.queue.&lt;queue-name&gt;.maximum-initialized-jobs-per-user
+              mapred.capacity-scheduler.queue.&lt;queue-<br/>name&gt;.maximum-initialized-jobs-per-user
             </td>
             <td>
               Maximum number of jobs which are allowed to be pre-initialized for
@@ -367,13 +365,13 @@
         </table>
       </section>   
       <section>
-        <title>Reviewing the configuration of the Capacity Scheduler</title>
+        <title>Reviewing the Configuration of the Capacity Scheduler</title>
         <p>
           Once the installation and configuration is completed, you can review
-          it after starting the Map/Reduce cluster from the admin UI.
+          it after starting the MapReduce cluster from the admin UI.
         </p>
         <ul>
-          <li>Start the Map/Reduce cluster as usual.</li>
+          <li>Start the MapReduce cluster as usual.</li>
           <li>Open the JobTracker web UI.</li>
           <li>The queues you have configured should be listed under the <em>Scheduling
               Information</em> section of the page.</li>
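
For reference, the per-queue properties reworked in the capacity_scheduler.xml hunks above are set in conf/capacity-scheduler.xml. A minimal sketch for a single queue, using the property names from the table in this patch (the queue name "default" and all values are illustrative, not part of the commit):

    <configuration>
      <!-- Percentage of cluster slots made available to this queue -->
      <property>
        <name>mapred.capacity-scheduler.queue.default.capacity</name>
        <value>100</value>
      </property>
      <!-- If true, job priorities are taken into account in scheduling decisions -->
      <property>
        <name>mapred.capacity-scheduler.queue.default.supports-priority</name>
        <value>false</value>
      </property>
      <!-- Minimum per-user share of the queue when users compete for it -->
      <property>
        <name>mapred.capacity-scheduler.queue.default.minimum-user-limit-percent</name>
        <value>100</value>
      </property>
    </configuration>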

Modified: hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/commands_manual.xml
URL: http://svn.apache.org/viewvc/hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/commands_manual.xml?rev=816439&r1=816438&r2=816439&view=diff
==============================================================================
--- hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/commands_manual.xml (original)
+++ hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/commands_manual.xml Fri Sep 18 02:20:48 2009
@@ -19,14 +19,14 @@
 <!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "http://forrest.apache.org/dtd/document-v20.dtd">
 <document>
 	<header>
-		<title>Commands Guide</title>
+		<title>Hadoop Commands Guide</title>
 	</header>
 	
 	<body>
 		<section>
 			<title>Overview</title>
 			<p>
-				All hadoop commands are invoked by the bin/hadoop script. Running the hadoop
+				All Hadoop commands are invoked by the bin/hadoop script. Running the Hadoop
 				script without any arguments prints the description for all commands.
 			</p>
 			<p>
@@ -104,11 +104,11 @@
 		
 		<section>
 			<title> User Commands </title>
-			<p>Commands useful for users of a hadoop cluster.</p>
+			<p>Commands useful for users of a Hadoop cluster.</p>
 			<section>
 				<title> archive </title>
 				<p>
-					Creates a hadoop archive. More information can be found at <a href="hadoop_archives.html">Hadoop Archives</a>.
+					Creates a Hadoop archive. For more information, see the <a href="hadoop_archives.html">Hadoop Archives Guide</a>.
 				</p>
 				<p>
 					<code>Usage: hadoop archive -archiveName NAME &lt;src&gt;* &lt;dest&gt;</code>
@@ -133,7 +133,7 @@
 			<section>
 				<title> distcp </title>
 				<p>
-					Copy file or directories recursively. More information can be found at <a href="distcp.html">Hadoop DistCp Guide</a>.
+					Copy file or directories recursively. More information can be found at <a href="distcp.html">DistCp Guide</a>.
 				</p>
 				<p>
 					<code>Usage: hadoop distcp &lt;srcurl&gt; &lt;desturl&gt;</code>
@@ -155,21 +155,22 @@
 			<section>
 				<title> fs </title>
 				<p>
-					<code>Usage: hadoop fs [</code><a href="commands_manual.html#Generic+Options">GENERIC_OPTIONS</a><code>] 
-					[COMMAND_OPTIONS]</code>
+					Runs a generic filesystem user client.
 				</p>
 				<p>
-					Runs a generic filesystem user client.
+					<code>Usage: hadoop fs [</code><a href="commands_manual.html#Generic+Options">GENERIC_OPTIONS</a><code>] 
+					[COMMAND_OPTIONS]</code>
 				</p>
 				<p>
-					The various COMMAND_OPTIONS can be found at <a href="hdfs_shell.html">Hadoop FS Shell Guide</a>.
+					The various COMMAND_OPTIONS can be found at 
+					<a href="http://hadoop.apache.org/common/docs/current/file_system_shell.html">File System Shell Guide</a>.
 				</p>   
 			</section>
 			
 			<section>
 				<title> fsck </title>
 				<p>
-					Runs a HDFS filesystem checking utility. See <a href="hdfs_user_guide.html#Fsck">Fsck</a> for more info.
+					Runs an HDFS filesystem checking utility. See <a href="http://hadoop.apache.org/hdfs/docs/current/hdfs_user_guide.html#Fsck">Fsck</a> for more info.
 				</p> 
 				<p><code>Usage: hadoop fsck [</code><a href="commands_manual.html#Generic+Options">GENERIC_OPTIONS</a><code>] 
 				&lt;path&gt; [-move | -delete | -openforwrite] [-files [-blocks 
@@ -220,12 +221,12 @@
 					<code>Usage: hadoop jar &lt;jar&gt; [mainClass] args...</code>
 				</p>
 				<p>
-					The streaming jobs are run via this command. Examples can be referred from 
-					<a href="streaming.html#More+usage+examples">Streaming examples</a>
+					The streaming jobs are run via this command. For examples, see 
+					<a href="streaming.html">Hadoop Streaming</a>.
 				</p>
 				<p>
-					Word count example is also run using jar command. It can be referred from
-					<a href="mapred_tutorial.html#Usage">Wordcount example</a>
+					The WordCount example is also run using the jar command. For examples, see the
+					<a href="mapred_tutorial.html">MapReduce Tutorial</a>.
 				</p>
 			</section>
 			
@@ -401,24 +402,27 @@
 			<section>
 				<title> CLASSNAME </title>
 				<p>
-					 hadoop script can be used to invoke any class.
+					 The Hadoop script can be used to invoke any class.
 				</p>
 				<p>
-					<code>Usage: hadoop CLASSNAME</code>
+					 Runs the class named CLASSNAME.
 				</p>
+
 				<p>
-					 Runs the class named CLASSNAME.
+					<code>Usage: hadoop CLASSNAME</code>
 				</p>
+
 			</section>
     </section>
 		<section>
 			<title> Administration Commands </title>
-			<p>Commands useful for administrators of a hadoop cluster.</p>
+			<p>Commands useful for administrators of a Hadoop cluster.</p>
 			<section>
 				<title> balancer </title>
 				<p>
 					Runs a cluster balancing utility. An administrator can simply press Ctrl-C to stop the 
-					rebalancing process. See <a href="hdfs_user_guide.html#Rebalancer">Rebalancer</a> for more details.
+					rebalancing process. For more details, see 
+					<a href="http://hadoop.apache.org/hdfs/docs/current/hdfs_user_guide.html#Rebalancer">Rebalancer</a>.
 				</p>
 				<p>
 					<code>Usage: hadoop balancer [-threshold &lt;threshold&gt;]</code>
@@ -472,7 +476,7 @@
 			           <tr>
 			          	<td><code>-rollback</code></td>
 			            <td>Rollsback the datanode to the previous version. This should be used after stopping the datanode 
-			            and distributing the old hadoop version.</td>
+			            and distributing the old Hadoop version.</td>
 			           </tr>
 			     </table>
 			</section>
@@ -584,7 +588,7 @@
         </tr>
         <tr>
         <td><code>-refreshQueueAcls</code></td>
-        <td> Refresh the queue acls used by hadoop, to check access during submissions
+        <td> Refresh the queue acls used by Hadoop, to check access during submissions
         and administration of the job by the user. The properties present in
         <code>mapred-queue-acls.xml</code> is reloaded by the queue manager.</td>
         </tr>
@@ -615,11 +619,11 @@
 			<section>
 				<title> namenode </title>
 				<p>
-					Runs the namenode. More info about the upgrade, rollback and finalize is at 
-					<a href="hdfs_user_guide.html#Upgrade+and+Rollback">Upgrade Rollback</a>
+					Runs the namenode. For more information about upgrade, rollback and finalize, see 
+					<a href="http://hadoop.apache.org/hdfs/docs/current/hdfs_user_guide.html#Upgrade+and+Rollback">Upgrade and Rollback</a>.
 				</p>
 				<p>
-					<code>Usage: hadoop namenode [-format] | [-upgrade] | [-rollback] | [-finalize] | [-importCheckpoint]</code>
+					<code>Usage: hadoop namenode [-format] | [-upgrade] | [-rollback] | [-finalize] | [-importCheckpoint] | [-checkpoint] | [-backup]</code>
 				</p>
 				<table>
 			          <tr><th> COMMAND_OPTION </th><th> Description </th></tr>
@@ -642,12 +646,12 @@
 			           </tr>
 			           <tr>
 			          	<td><code>-upgrade</code></td>
-			            <td>Namenode should be started with upgrade option after the distribution of new hadoop version.</td>
+			            <td>Namenode should be started with upgrade option after the distribution of new Hadoop version.</td>
 			           </tr>
 			           <tr>
 			          	<td><code>-rollback</code></td>
 			            <td>Rollsback the namenode to the previous version. This should be used after stopping the cluster 
-			            and distributing the old hadoop version.</td>
+			            and distributing the old Hadoop version.</td>
 			           </tr>
 			           <tr>
 			          	<td><code>-finalize</code></td>
@@ -657,18 +661,33 @@
 			           <tr>
 			          	<td><code>-importCheckpoint</code></td>
 			            <td>Loads image from a checkpoint directory and saves it into the current one. Checkpoint directory 
-			            is read from property fs.checkpoint.dir</td>
+			            is read from property fs.checkpoint.dir
+			            (see <a href="http://hadoop.apache.org/hdfs/docs/current/hdfs_user_guide.html#Import+checkpoint">Import Checkpoint</a>).
+			            </td>
+			           </tr>
+			            <tr>
+			          	<td><code>-checkpoint</code></td>
+			            <td>Enables checkpointing 
+			            (see <a href="http://hadoop.apache.org/hdfs/docs/current/hdfs_user_guide.html#Checkpoint+Node">Checkpoint Node</a>).</td>
+			           </tr>
+			            <tr>
+			          	<td><code>-backup</code></td>
+			            <td>Enables checkpointing and maintains an in-memory, up-to-date copy of the file system namespace 
+			            (see <a href="http://hadoop.apache.org/hdfs/docs/current/hdfs_user_guide.html#Backup+Node">Backup Node</a>).</td>
 			           </tr>
 			     </table>
 			</section>
 			
 			<section>
 				<title> secondarynamenode </title>
-				<p>
-					Use of the Secondary NameNode has been deprecated. Instead, consider using a 
-					<a href="hdfs_user_guide.html#Checkpoint+node">Checkpoint node</a> or 
-					<a href="hdfs_user_guide.html#Backup+node">Backup node</a>. Runs the HDFS secondary 
-					namenode. See <a href="hdfs_user_guide.html#Secondary+NameNode">Secondary NameNode</a> 
+				<note>
+					The Secondary NameNode has been deprecated. Instead, consider using the
+					<a href="http://hadoop.apache.org/hdfs/docs/current/hdfs_user_guide.html#Checkpoint+Node">Checkpoint Node</a> or 
+					<a href="http://hadoop.apache.org/hdfs/docs/current/hdfs_user_guide.html#Backup+Node">Backup Node</a>. 
+				</note>
+				<p>	
+					Runs the HDFS secondary 
+					namenode. See <a href="http://hadoop.apache.org/hdfs/docs/current/hdfs_user_guide.html#Secondary+NameNode">Secondary NameNode</a> 
 					for more info.
 				</p>
 				<p>
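
A side note on the -refreshQueueAcls entry in the hunks above: the ACLs it reloads live in conf/mapred-queue-acls.xml. A hedged sketch for a single queue (the queue name, users, and groups are made-up placeholders; the mapred.queue.&lt;queue-name&gt;.acl-* property names are assumed from the queue-ACL convention this manual describes):

    <configuration>
      <!-- Users and groups allowed to submit jobs to the "default" queue -->
      <property>
        <name>mapred.queue.default.acl-submit-job</name>
        <value>alice,bob devgroup</value>
      </property>
      <!-- Users and groups allowed to administer jobs in the "default" queue -->
      <property>
        <name>mapred.queue.default.acl-administer-jobs</name>
        <value>carol opsgroup</value>
      </property>
    </configuration>

After editing this file, the refresh command described above reloads it on the JobTracker without a restart.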

Modified: hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/distcp.xml
URL: http://svn.apache.org/viewvc/hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/distcp.xml?rev=816439&r1=816438&r2=816439&view=diff
==============================================================================
--- hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/distcp.xml (original)
+++ hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/distcp.xml Fri Sep 18 02:20:48 2009
@@ -30,10 +30,10 @@
       <title>Overview</title>
 
       <p>DistCp (distributed copy) is a tool used for large inter/intra-cluster
-      copying. It uses Map/Reduce to effect its distribution, error
+      copying. It uses MapReduce to effect its distribution, error
       handling and recovery, and reporting. It expands a list of files and
       directories into input to map tasks, each of which will copy a partition
-      of the files specified in the source list. Its Map/Reduce pedigree has
+      of the files specified in the source list. Its MapReduce pedigree has
       endowed it with some quirks in both its semantics and execution. The
       purpose of this document is to offer guidance for common tasks and to
       elucidate its model.</p>
@@ -45,36 +45,35 @@
 
       <section>
         <title>Basic</title>
-        <p>The most common invocation of DistCp is an inter-cluster copy:</p>
-        <p><code>bash$ hadoop distcp hdfs://nn1:8020/foo/bar \</code><br/>
-           <code>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
-                 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
-                 hdfs://nn2:8020/bar/foo</code></p>
+    <p>The most common invocation of DistCp is an inter-cluster copy:</p>
+<source>
+bash$ hadoop distcp hdfs://nn1:8020/foo/bar \ 
+            hdfs://nn2:8020/bar/foo 
+</source>             
 
         <p>This will expand the namespace under <code>/foo/bar</code> on nn1
         into a temporary file, partition its contents among a set of map
         tasks, and start a copy on each TaskTracker from nn1 to nn2. Note
         that DistCp expects absolute paths.</p>
 
-        <p>One can also specify multiple source directories on the command
-        line:</p>
-        <p><code>bash$ hadoop distcp hdfs://nn1:8020/foo/a \</code><br/>
-           <code>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
-                 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
-                 hdfs://nn1:8020/foo/b \</code><br/>
-           <code>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
-                 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
-                 hdfs://nn2:8020/bar/foo</code></p>
-
-        <p>Or, equivalently, from a file using the <code>-f</code> option:<br/>
-        <code>bash$ hadoop distcp -f hdfs://nn1:8020/srclist \</code><br/>
-        <code>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
-              &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
-              &nbsp;hdfs://nn2:8020/bar/foo</code><br/></p>
-
-        <p>Where <code>srclist</code> contains<br/>
-        <code>&nbsp;&nbsp;&nbsp;&nbsp;hdfs://nn1:8020/foo/a</code><br/>
-        <code>&nbsp;&nbsp;&nbsp;&nbsp;hdfs://nn1:8020/foo/b</code></p>
+    <p>One can also specify multiple source directories on the command line:</p>
+<source>
+bash$ hadoop distcp hdfs://nn1:8020/foo/a \ 
+            hdfs://nn1:8020/foo/b \ 
+            hdfs://nn2:8020/bar/foo 
+</source>             
+
+<p>Or, equivalently, from a file using the <code>-f</code> option:</p>
+<source>
+bash$ hadoop distcp -f hdfs://nn1:8020/srclist \ 
+            hdfs://nn2:8020/bar/foo 
+</source>          
+
+<p>Where <code>srclist</code> contains:</p> 
+<source>
+hdfs://nn1:8020/foo/a 
+hdfs://nn1:8020/foo/b 
+</source>
 
         <p>When copying from multiple sources, DistCp will abort the copy with
         an error message if two sources collide, but collisions at the
@@ -89,11 +88,11 @@
         both the source and destination file systems. For HDFS, both the source
         and destination must be running the same version of the protocol or use
         a backwards-compatible protocol (see <a href="#cpver">Copying Between
-        Versions</a>).</p>
+        Versions of HDFS</a>).</p>
 
         <p>After a copy, it is recommended that one generates and cross-checks
         a listing of the source and destination to verify that the copy was
-        truly successful. Since DistCp employs both Map/Reduce and the
+        truly successful. Since DistCp employs both MapReduce and the
         FileSystem API, issues in or between any of the three could adversely
         and silently affect the copy. Some have had success running with
         <code>-update</code> enabled to perform a second pass, but users should
@@ -107,11 +106,13 @@
 
       </section> <!-- Basic -->
 
+
       <section id="options">
         <title>Options</title>
 
         <section>
         <title>Option Index</title>
+        <p></p>
         <table>
           <tr><th> Flag </th><th> Description </th><th> Notes </th></tr>
 
@@ -150,7 +151,7 @@
               <td>Overwrite destination</td>
               <td>If a map fails and <code>-i</code> is not specified, all the
               files in the split, not only those that failed, will be recopied.
-              As discussed in the <a href="#uo">following</a>, it also changes
+              As discussed in <a href="#uo">Update and Overwrite</a>, it also changes
               the semantics for generating destination paths, so users should
               use this carefully.
               </td></tr>
@@ -159,8 +160,8 @@
               <td>As noted in the preceding, this is not a &quot;sync&quot;
               operation. The only criterion examined is the source and
               destination file sizes; if they differ, the source file
-              replaces the destination file. As discussed in the
-              <a href="#uo">following</a>, it also changes the semantics for
+              replaces the destination file. As discussed in 
+              <a href="#uo">Update and Overwrite</a>, it also changes the semantics for
               generating destination paths, so users should use this carefully.
               </td></tr>
           <tr><td><code>-f &lt;urilist_uri&gt;</code></td>
@@ -187,7 +188,9 @@
 
         </table>
 
-      </section>
+      </section> <!-- Option Index -->
+
+
 
       <section id="Symbolic-Representations">
         <title>Symbolic Representations</title>
@@ -200,7 +203,7 @@
           <li>1230k = 1230 * 1024 = 1259520</li>
           <li>891g = 891 * 1024^3 = 956703965184</li>
         </ul>
-      </section>
+      </section> <!-- Symbolic-Representations -->
 
       <section id="uo">
         <title>Update and Overwrite</title>
@@ -210,12 +213,15 @@
         <code>/foo/b</code> to <code>/bar/foo</code>, where the sources contain
         the following:</p>
 
-        <p><code>&nbsp;&nbsp;&nbsp;&nbsp;hdfs://nn1:8020/foo/a</code><br/>
-        <code>&nbsp;&nbsp;&nbsp;&nbsp;hdfs://nn1:8020/foo/a/aa</code><br/>
-        <code>&nbsp;&nbsp;&nbsp;&nbsp;hdfs://nn1:8020/foo/a/ab</code><br/>
-        <code>&nbsp;&nbsp;&nbsp;&nbsp;hdfs://nn1:8020/foo/b</code><br/>
-        <code>&nbsp;&nbsp;&nbsp;&nbsp;hdfs://nn1:8020/foo/b/ba</code><br/>
-        <code>&nbsp;&nbsp;&nbsp;&nbsp;hdfs://nn1:8020/foo/b/ab</code></p>
+        
+<source>
+    hdfs://nn1:8020/foo/a 
+    hdfs://nn1:8020/foo/a/aa 
+    hdfs://nn1:8020/foo/a/ab 
+    hdfs://nn1:8020/foo/b 
+    hdfs://nn1:8020/foo/b/ba 
+    hdfs://nn1:8020/foo/b/ab 
+</source>
 
         <p>If either <code>-update</code> or <code>-overwrite</code> is set,
         then both sources will map an entry to <code>/bar/foo/ab</code> at the
@@ -226,46 +232,51 @@
         <p>In the default case, both <code>/bar/foo/a</code> and
         <code>/bar/foo/b</code> will be created and neither will collide.</p>
 
-        <p>Now consider a legal copy using <code>-update</code>:<br/>
-        <code>distcp -update hdfs://nn1:8020/foo/a \</code><br/>
-        <code>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
-              &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
-              hdfs://nn1:8020/foo/b \</code><br/>
-        <code>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
-              &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
-              hdfs://nn2:8020/bar</code></p>
-
-        <p>With sources/sizes:</p>
-
-        <p><code>&nbsp;&nbsp;&nbsp;&nbsp;hdfs://nn1:8020/foo/a</code><br/>
-        <code>&nbsp;&nbsp;&nbsp;&nbsp;hdfs://nn1:8020/foo/a/aa 32</code><br/>
-        <code>&nbsp;&nbsp;&nbsp;&nbsp;hdfs://nn1:8020/foo/a/ab 32</code><br/>
-        <code>&nbsp;&nbsp;&nbsp;&nbsp;hdfs://nn1:8020/foo/b</code><br/>
-        <code>&nbsp;&nbsp;&nbsp;&nbsp;hdfs://nn1:8020/foo/b/ba 64</code><br/>
-        <code>&nbsp;&nbsp;&nbsp;&nbsp;hdfs://nn1:8020/foo/b/bb 32</code></p>
-
-        <p>And destination/sizes:</p>
-
-        <p><code>&nbsp;&nbsp;&nbsp;&nbsp;hdfs://nn2:8020/bar</code><br/>
-        <code>&nbsp;&nbsp;&nbsp;&nbsp;hdfs://nn2:8020/bar/aa 32</code><br/>
-        <code>&nbsp;&nbsp;&nbsp;&nbsp;hdfs://nn2:8020/bar/ba 32</code><br/>
-        <code>&nbsp;&nbsp;&nbsp;&nbsp;hdfs://nn2:8020/bar/bb 64</code></p>
-
-        <p>Will effect:</p>
-
-        <p><code>&nbsp;&nbsp;&nbsp;&nbsp;hdfs://nn2:8020/bar</code><br/>
-        <code>&nbsp;&nbsp;&nbsp;&nbsp;hdfs://nn2:8020/bar/aa 32</code><br/>
-        <code>&nbsp;&nbsp;&nbsp;&nbsp;hdfs://nn2:8020/bar/ab 32</code><br/>
-        <code>&nbsp;&nbsp;&nbsp;&nbsp;hdfs://nn2:8020/bar/ba 64</code><br/>
-        <code>&nbsp;&nbsp;&nbsp;&nbsp;hdfs://nn2:8020/bar/bb 32</code></p>
+<p>Now consider a legal copy using <code>-update</code>:</p>
+<source>
+distcp -update hdfs://nn1:8020/foo/a \ 
+    hdfs://nn1:8020/foo/b \ 
+    hdfs://nn1:8020/foo/file1 \
+    hdfs://nn2:8020/bar 
+</source>
+
+<p>With sources/sizes:</p>
+<source>
+    hdfs://nn1:8020/foo/a 
+    hdfs://nn1:8020/foo/a/aa 32 
+    hdfs://nn1:8020/foo/a/ab 32 
+    hdfs://nn1:8020/foo/b 
+    hdfs://nn1:8020/foo/b/ba 64 
+    hdfs://nn1:8020/foo/b/bb 32 
+    hdfs://nn1:8020/foo/file1 20
+</source>
+
+<p>And destination/sizes:</p>
+<source>
+    hdfs://nn2:8020/bar 
+    hdfs://nn2:8020/bar/aa 32 
+    hdfs://nn2:8020/bar/ba 32 
+    hdfs://nn2:8020/bar/bb 64 
+    hdfs://nn1:8020/foo/file1 15
+</source>
+
+<p>Will effect:</p>
+<source>
+    hdfs://nn2:8020/bar 
+    hdfs://nn2:8020/bar/aa 32 
+    hdfs://nn2:8020/bar/ab 32 
+    hdfs://nn2:8020/bar/ba 64 
+    hdfs://nn2:8020/bar/bb 32 
+    hdfs://nn1:8020/foo/file1 20
+</source>
 
         <p>Only <code>aa</code> is not overwritten on nn2. If
         <code>-overwrite</code> were specified, all elements would be
         overwritten.</p>
 
-      </section> <!-- Update and Overwrite -->
+    </section> <!-- Update and Overwrite -->
 
-      </section> <!-- Options -->
+    </section> <!-- Options -->
 
     </section> <!-- Usage -->
 
@@ -273,7 +284,7 @@
       <title>Appendix</title>
 
       <section>
-        <title>Map sizing</title>
+        <title>Map Sizing</title>
 
           <p>DistCp makes a faint attempt to size each map comparably so that
           each copies roughly the same number of bytes. Note that files are the
@@ -293,7 +304,7 @@
       </section>
 
       <section id="cpver">
-        <title>Copying between versions of HDFS</title>
+        <title>Copying Between Versions of HDFS</title>
 
         <p>For copying between two different versions of Hadoop, one will
         usually use HftpFileSystem. This is a read-only FileSystem, so DistCp
@@ -306,7 +317,7 @@
       </section>
 
       <section>
-        <title>Map/Reduce and other side-effects</title>
+        <title>MapReduce and Other Side-effects</title>
 
         <p>As has been mentioned in the preceding, should a map fail to copy
         one of its inputs, there will be several side-effects.</p>

Modified: hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/fair_scheduler.xml
URL: http://svn.apache.org/viewvc/hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/fair_scheduler.xml?rev=816439&r1=816438&r2=816439&view=diff
==============================================================================
--- hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/fair_scheduler.xml (original)
+++ hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/fair_scheduler.xml Fri Sep 18 02:20:48 2009
@@ -18,7 +18,7 @@
 <!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "http://forrest.apache.org/dtd/document-v20.dtd">
 <document>
   <header>
-    <title>Fair Scheduler Guide</title>
+    <title>Fair Scheduler</title>
   </header>
   <body>
 
@@ -26,7 +26,7 @@
       <title>Purpose</title>
 
       <p>This document describes the Fair Scheduler, a pluggable
-        Map/Reduce scheduler for Hadoop which provides a way to share
+        MapReduce scheduler for Hadoop which provides a way to share
         large clusters.</p>
     </section>
 
@@ -148,7 +148,7 @@
           The following parameters can be set in <em>mapred-site.xml</em>
           to affect the behavior of the fair scheduler:
         </p>
-        <p><strong>Basic Parameters:</strong></p>
+        <p><strong>Basic Parameters</strong></p>
         <table>
           <tr>
           <th>Name</th><th>Description</th>
@@ -195,7 +195,8 @@
           </td>
           </tr>
         </table>
-        <p><strong>Advanced Parameters:</strong></p>
+        <p> <br></br></p>
+        <p><strong>Advanced Parameters</strong> </p>
         <table>
           <tr>
           <th>Name</th><th>Description</th>
@@ -532,7 +533,7 @@
      implementing a "shortest job first" policy which reduces response
      times for interactive jobs even further.
      These extension points are listed in
-     <a href="#Advanced+Parameters">advanced mapred-site.xml properties</a>.
+     <a href="#Scheduler+Parameters+in+mapred-site.xml">Advanced Parameters</a>.
      </p>
     </section>
     -->
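
For context on the fair_scheduler.xml hunks above: the Fair Scheduler is enabled the same way as the Capacity Scheduler, by pointing mapred.jobtracker.taskScheduler at the scheduler class in mapred-site.xml. A minimal sketch (the allocation-file path is a placeholder, and mapred.fairscheduler.allocation.file is assumed from the scheduler parameter tables being edited here):

    <property>
      <name>mapred.jobtracker.taskScheduler</name>
      <value>org.apache.hadoop.mapred.FairScheduler</value>
    </property>
    <!-- Optional: pool allocations and limits are read from this file -->
    <property>
      <name>mapred.fairscheduler.allocation.file</name>
      <value>/path/to/conf/fair-scheduler.xml</value>
    </property>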

Modified: hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/hadoop_archives.xml
URL: http://svn.apache.org/viewvc/hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/hadoop_archives.xml?rev=816439&r1=816438&r2=816439&view=diff
==============================================================================
--- hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/hadoop_archives.xml (original)
+++ hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/hadoop_archives.xml Fri Sep 18 02:20:48 2009
@@ -18,11 +18,11 @@
 <!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "http://forrest.apache.org/dtd/document-v20.dtd">
 <document>
         <header>
-        <title>Archives Guide</title>
+        <title>Hadoop Archives Guide</title>
         </header>
         <body>
         <section>
-        <title> What are Hadoop archives? </title>
+        <title>Overview</title>
         <p>
         Hadoop archives are special format archives. A Hadoop archive
         maps to a file system directory. A Hadoop archive always has a *.har
@@ -32,8 +32,9 @@
         within the part files. 
         </p>
         </section>
+        
         <section>
-        <title> How to create an archive? </title>
+        <title> How to Create an Archive </title>
         <p>
         <code>Usage: hadoop archive -archiveName name &lt;src&gt;* &lt;dest&gt;</code>
         </p>
@@ -42,8 +43,8 @@
         An example would be foo.har. The name should have a *.har extension. 
         The inputs are file system pathnames which work as usual with regular
         expressions. The destination directory would contain the archive.
-        Note that this is a Map/Reduce job that creates the archives. You would
-        need a map reduce cluster to run this. The following is an example:</p>
+        Note that this is a MapReduce job that creates the archives. You would
+        need a MapReduce cluster to run this. The following is an example:</p>
         <p>
         <code>hadoop archive -archiveName foo.har /user/hadoop/dir1 /user/hadoop/dir2 /user/zoo/</code>
         </p><p>
@@ -52,28 +53,29 @@
         The sources are not changed or removed when an archive is created.
         </p>
         </section>
+        
         <section>
-        <title> How to look up files in archives? </title>
+        <title> How to Look Up Files in Archives </title>
         <p>
         The archive exposes itself as a file system layer. So all the fs shell
         commands in the archives work but with a different URI. Also, note that
-        archives are immutable. So, rename's, deletes and creates return
-        an error. URI for Hadoop Archives is 
+        archives are immutable. So, rename, delete and create will return
+        an error. The URI for Hadoop Archives is:
         </p><p><code>har://scheme-hostname:port/archivepath/fileinarchive</code></p><p>
         If no scheme is provided it assumes the underlying filesystem. 
-        In that case the URI would look like 
+        In that case the URI would look like this:
         </p><p><code>
         har:///archivepath/fileinarchive</code></p>
         <p>
         Here is an example of archive. The input to the archives is /dir. The directory dir contains 
-        files filea, fileb. To archive /dir to /user/hadoop/foo.har, the command is 
+        files filea, fileb. To archive /dir to /user/hadoop/foo.har, the command is: 
         </p>
         <p><code>hadoop archive -archiveName foo.har /dir /user/hadoop</code>
         </p><p>
-        To get file listing for files in the created archive 
+        To get file listing for files in the created archive: 
         </p>
         <p><code>hadoop dfs -lsr har:///user/hadoop/foo.har</code></p>
-        <p>To cat filea in archive -
+        <p>To cat filea in archive:
         </p><p><code>hadoop dfs -cat har:///user/hadoop/foo.har/dir/filea</code></p>
         </section>
 	</body>


