hadoop-common-commits mailing list archives

From omal...@apache.org
Subject svn commit: r816409 - in /hadoop/common/trunk: ./ src/docs/src/documentation/ src/docs/src/documentation/content/xdocs/ src/docs/src/documentation/resources/images/
Date Thu, 17 Sep 2009 23:24:39 GMT
Author: omalley
Date: Thu Sep 17 23:24:38 2009
New Revision: 816409

URL: http://svn.apache.org/viewvc?rev=816409&view=rev
Log:
HADOOP-6217. Update documentation for project split. (Corinne Chandel via 
omalley)

Added:
    hadoop/common/trunk/src/docs/src/documentation/content/xdocs/file_system_shell.xml
    hadoop/common/trunk/src/docs/src/documentation/content/xdocs/single_node_setup.xml
    hadoop/common/trunk/src/docs/src/documentation/resources/images/common-logo.jpg   (with props)
Removed:
    hadoop/common/trunk/src/docs/src/documentation/content/xdocs/SLG_user_guide.xml
    hadoop/common/trunk/src/docs/src/documentation/content/xdocs/capacity_scheduler.xml
    hadoop/common/trunk/src/docs/src/documentation/content/xdocs/commands_manual.xml
    hadoop/common/trunk/src/docs/src/documentation/content/xdocs/distcp.xml
    hadoop/common/trunk/src/docs/src/documentation/content/xdocs/fair_scheduler.xml
    hadoop/common/trunk/src/docs/src/documentation/content/xdocs/hadoop_archives.xml
    hadoop/common/trunk/src/docs/src/documentation/content/xdocs/hdfs_design.xml
    hadoop/common/trunk/src/docs/src/documentation/content/xdocs/hdfs_imageviewer.xml
    hadoop/common/trunk/src/docs/src/documentation/content/xdocs/hdfs_quota_admin_guide.xml
    hadoop/common/trunk/src/docs/src/documentation/content/xdocs/hdfs_shell.xml
    hadoop/common/trunk/src/docs/src/documentation/content/xdocs/hdfs_user_guide.xml
    hadoop/common/trunk/src/docs/src/documentation/content/xdocs/hod_admin_guide.xml
    hadoop/common/trunk/src/docs/src/documentation/content/xdocs/hod_config_guide.xml
    hadoop/common/trunk/src/docs/src/documentation/content/xdocs/hod_user_guide.xml
    hadoop/common/trunk/src/docs/src/documentation/content/xdocs/mapred_tutorial.xml
    hadoop/common/trunk/src/docs/src/documentation/content/xdocs/quickstart.xml
    hadoop/common/trunk/src/docs/src/documentation/content/xdocs/vaidya.xml
Modified:
    hadoop/common/trunk/CHANGES.txt
    hadoop/common/trunk/src/docs/src/documentation/content/xdocs/cluster_setup.xml
    hadoop/common/trunk/src/docs/src/documentation/content/xdocs/index.xml
    hadoop/common/trunk/src/docs/src/documentation/content/xdocs/native_libraries.xml
    hadoop/common/trunk/src/docs/src/documentation/content/xdocs/service_level_auth.xml
    hadoop/common/trunk/src/docs/src/documentation/content/xdocs/site.xml
    hadoop/common/trunk/src/docs/src/documentation/content/xdocs/tabs.xml
    hadoop/common/trunk/src/docs/src/documentation/skinconf.xml

Modified: hadoop/common/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/hadoop/common/trunk/CHANGES.txt?rev=816409&r1=816408&r2=816409&view=diff
==============================================================================
--- hadoop/common/trunk/CHANGES.txt (original)
+++ hadoop/common/trunk/CHANGES.txt Thu Sep 17 23:24:38 2009
@@ -562,6 +562,9 @@
     HADOOP-6216. Support comments in host files.  (Ravi Phulari and Dmytro
     Molkov via szetszwo)
 
+    HADOOP-6217. Update documentation for project split. (Corinne Chandel via 
+    omalley)
+
   OPTIMIZATIONS
 
     HADOOP-5595. NameNode does not need to run a replicator to choose a

Modified: hadoop/common/trunk/src/docs/src/documentation/content/xdocs/cluster_setup.xml
URL: http://svn.apache.org/viewvc/hadoop/common/trunk/src/docs/src/documentation/content/xdocs/cluster_setup.xml?rev=816409&r1=816408&r2=816409&view=diff
==============================================================================
--- hadoop/common/trunk/src/docs/src/documentation/content/xdocs/cluster_setup.xml (original)
+++ hadoop/common/trunk/src/docs/src/documentation/content/xdocs/cluster_setup.xml Thu Sep 17 23:24:38 2009
@@ -33,20 +33,20 @@
       Hadoop clusters ranging from a few nodes to extremely large clusters with 
       thousands of nodes.</p>
       <p>
-      To play with Hadoop, you may first want to install Hadoop on a single machine (see <a href="quickstart.html"> Hadoop Quick Start</a>).
+      To play with Hadoop, you may first want to install Hadoop on a single machine (see <a href="single_node_setup.html"> Single Node Setup</a>).
       </p>
     </section>
     
     <section>
-      <title>Pre-requisites</title>
+      <title>Prerequisites</title>
       
       <ol>
         <li>
-          Make sure all <a href="quickstart.html#PreReqs">requisite</a> software 
+          Make sure all <a href="single_node_setup.html#PreReqs">required software</a> 
           is installed on all nodes in your cluster.
         </li>
         <li>
-          <a href="quickstart.html#Download">Get</a> the Hadoop software.
+          <a href="single_node_setup.html#Download">Download</a> the Hadoop software.
         </li>
       </ol>
     </section>
@@ -81,7 +81,7 @@
         <ol>
           <li>
             Read-only default configuration - 
-            <a href="ext:core-default">src/core/core-default.xml</a>, 
+            <a href="ext:common-default">src/common/common-default.xml</a>, 
             <a href="ext:hdfs-default">src/hdfs/hdfs-default.xml</a> and 
             <a href="ext:mapred-default">src/mapred/mapred-default.xml</a>.
           </li>
@@ -94,8 +94,8 @@
         </ol>
       
         <p>To learn more about how the Hadoop framework is controlled by these 
-        configuration files, look 
-        <a href="ext:api/org/apache/hadoop/conf/configuration">here</a>.</p>
+        configuration files, see
+        <a href="ext:api/org/apache/hadoop/conf/configuration">Class Configuration</a>.</p>
       
         <p>Additionally, you can control the Hadoop scripts found in the 
         <code>bin/</code> directory of the distribution, by setting site-specific 
@@ -271,16 +271,6 @@
 		        TaskTrackers.
 		      </td>
   		    </tr>
-		  </table>
-      
-      <p><br/><code> conf/mapred-queues.xml</code></p>
-      
-      <table>
-       <tr>
-          <th>Parameter</th>
-          <th>Value</th> 
-          <th>Notes</th>
-       </tr>
         <tr>
           <td>mapred.queue.names</td>
           <td>Comma separated list of queues to which jobs can be submitted.</td>
@@ -289,8 +279,8 @@
             with the name as <em>default</em>. Hence, this parameter's
             value should always contain the string <em>default</em>.
             Some job schedulers supported in Hadoop, like the 
-            <a href="capacity_scheduler.html">Capacity 
-            Scheduler</a>, support multiple queues. If such a scheduler is
+            <a href="http://hadoop.apache.org/mapreduce/docs/current/capacity_scheduler.html">Capacity Scheduler</a>, 
+            support multiple queues. If such a scheduler is
             being used, the list of configured queue names must be
             specified here. Once queues are defined, users can submit
             jobs to a queue using the property name 
@@ -313,6 +303,16 @@
             <em>mapred.queue.queue-name.acl-name</em>, defined below.
           </td>
         </tr>
+		  </table>
+      
+      <p><br/><code> conf/mapred-queue-acls.xml</code></p>
+      
+      <table>
+       <tr>
+          <th>Parameter</th>
+          <th>Value</th> 
+          <th>Notes</th>
+       </tr>
         <tr>
           <td>mapred.queue.<em>queue-name</em>.acl-submit-job</td>
           <td>List of users and groups that can submit jobs to the
@@ -340,15 +340,6 @@
             his/her own job, irrespective of the ACLs.
           </td>
         </tr>
-        <tr>
-          <td>mapred.queue.<em>queue-name</em>.state</td>
-          <td>Specifies whether <em>queue-name</em> is running or stopped</td> 
-          <td>
-            Jobs can be submitted to a queue only if it is in the 
-            <em>running</em> state. However, jobs which are already running
-            when a queue is stopped will be allowed to finish.
-          </td>
-        </tr>
       </table>
       
 
@@ -401,10 +392,18 @@
                   </tr>
                   <tr>
                     <td>conf/mapred-site.xml</td>
-                    <td>mapred.child.java.opts</td>
+                    <td>mapred.map.child.java.opts</td>
                     <td>-Xmx512M</td>
                     <td>
-                      Larger heap-size for child jvms of maps/reduces. 
+                      Larger heap-size for child jvms of maps. 
+                    </td>
+                  </tr>
+                  <tr>
+                    <td>conf/mapred-site.xml</td>
+                    <td>mapred.reduce.child.java.opts</td>
+                    <td>-Xmx512M</td>
+                    <td>
+                      Larger heap-size for child jvms of reduces. 
                     </td>
                   </tr>
                   <tr>
@@ -474,9 +473,17 @@
                   </tr>
                   <tr>
                     <td>conf/mapred-site.xml</td>
-                    <td>mapred.child.java.opts</td>
+                    <td>mapred.map.child.java.opts</td>
+                    <td>-Xmx512M</td>
+                    <td>
+                      Larger heap-size for child jvms of maps. 
+                    </td>
+                  </tr>
+                  <tr>
+                    <td>conf/mapred-site.xml</td>
+                    <td>mapred.reduce.child.java.opts</td>
                     <td>-Xmx1024M</td>
-                    <td>Larger heap-size for child jvms of maps/reduces.</td>
+                    <td>Larger heap-size for child jvms of reduces.</td>
                   </tr>
                 </table>
               </li>
@@ -486,18 +493,18 @@
         <title> Memory management</title>
         <p>Users/admins can also specify the maximum virtual memory 
         of the launched child-task, and any sub-process it launches 
-        recursively, using <code>mapred.child.ulimit</code>. Note that
-        the value set here is a per process limit.
-        The value for <code>mapred.child.ulimit</code> should be specified 
-        in kilo bytes (KB). And also the value must be greater than
+        recursively, using <code>mapred.{map|reduce}.child.ulimit</code>. Note 
+        that the value set here is a per process limit.
+        The value for <code>mapred.{map|reduce}.child.ulimit</code> should be 
+        specified in kilobytes (KB). Also, the value must be greater than
         or equal to the -Xmx passed to JavaVM, else the VM might not start. 
         </p>
         
         <p>Note: <code>mapred.child.java.opts</code> are used only for 
         configuring the launched child tasks from task tracker. Configuring 
-        the memory options for daemons is documented in 
+        the memory options for daemons is documented under 
         <a href="cluster_setup.html#Configuring+the+Environment+of+the+Hadoop+Daemons">
-        cluster_setup.html </a></p>
+        Configuring the Environment of the Hadoop Daemons</a>.</p>
         
         <p>The memory available to some parts of the framework is also
         configurable. In map and reduce tasks, performance may be influenced
@@ -658,11 +665,13 @@
             distribution. The task tracker uses this executable to 
             launch and kill tasks. The setuid executable switches to
             the user who has submitted the job and launches or kills
-            the tasks. Currently, this task controller 
-            opens up permissions to local files and directories used 
-            by the tasks such as the job jar files, distributed archive 
-            files, intermediate files and task log files. In future,
-            it is expected that stricter file permissions are used.
+            the tasks. For maximum security, this task controller 
+            sets up restricted permissions and user/group ownership of
+            local files and directories used by the tasks such as the
+            job jar files, intermediate files and task log files. Currently,
+            permissions on distributed cache files are opened up to be
+            accessible by all users. In the future, it is expected that stricter
+            file permissions are set for these files too.
             </td>
             </tr>
             </table>
@@ -704,18 +713,32 @@
             </p>
             
             <p>
-            The executable must be deployed as a setuid executable, by changing
-            the ownership to <em>root</em>, group ownership to that of tasktracker
-            and giving it permissions <em>4510</em>.Please take a note that,
-            group which owns task-controller should contain only tasktracker
-            as its memeber and not users who submit jobs.
+            The executable must have specific permissions as follows. The
+            executable should have <em>6050 or --Sr-s---</em> permissions,
+            user-owned by <em>root</em> (super-user) and group-owned by a
+            group of which the TaskTracker's user is the sole member.
+            For example, suppose the TaskTracker is run as user
+            <em>mapred</em>, who is a member of the groups <em>users</em> and
+            <em>mapredGroup</em>, either of which may be the primary group.
+            Suppose also that <em>users</em> has both <em>mapred</em> and
+            another user <em>X</em> as its members, while <em>mapredGroup</em>
+            has only <em>mapred</em> as its member. Then the setuid/setgid
+            executable should be set to <em>6050 or --Sr-s---</em>, user-owned
+            by <em>root</em> and group-owned by <em>mapredGroup</em> (and not
+            <em>users</em>, which also has <em>X</em> as a member besides
+            <em>mapred</em>).
             </p>
             
             <p>The executable requires a configuration file called 
             <em>taskcontroller.cfg</em> to be
             present in the configuration directory passed to the ant target 
             mentioned above. If the binary was not built with a specific 
-            conf directory, the path defaults to <em>/path-to-binary/../conf</em>.
+            conf directory, the path defaults to
+            <em>/path-to-binary/../conf</em>. The configuration file must be
+            owned by the user running TaskTracker (user <em>mapred</em> in the
+            above example), group-owned by any group, and should have the
+            permissions <em>0400 or r--------</em>.
             </p>
             
             <p>The executable requires following configuration items to be 
@@ -730,17 +753,81 @@
             validate paths passed to the setuid executable in order to prevent
             arbitrary paths being passed to it.</td>
             </tr>
+            <tr>
+            <td>hadoop.log.dir</td>
+            <td>Path to the hadoop log directory. Should be the same as the value
+            with which the TaskTracker is started. This is required to set proper
+            permissions on the log files so that they can be written to by the user's
+            tasks and read by the TaskTracker for serving on the web UI.</td>
+            </tr>
             </table>
 
             <p>
-            The LinuxTaskController requires that paths leading up to
+            The LinuxTaskController requires that paths including and leading up to
             the directories specified in
-            <em>mapred.local.dir</em> and <em>hadoop.log.dir</em> to be 755
-            and directories themselves having 777 permissions.
+            <em>mapred.local.dir</em> and <em>hadoop.log.dir</em> be set to
+            755 permissions.
             </p>
             </section>
             
           </section>
+          <section>
+            <title>Monitoring Health of TaskTracker Nodes</title>
+            <p>Hadoop Map/Reduce provides a mechanism by which administrators 
+            can configure the TaskTracker to run an administrator supplied
+            script periodically to determine if a node is healthy or not.
+            Administrators can determine if the node is in a healthy state
+            by performing any checks of their choice in the script. If the
+            script detects the node to be in an unhealthy state, it must print
+            a line to standard output beginning with the string <em>ERROR</em>.
+            The TaskTracker spawns the script periodically and checks its 
+            output. If the script's output contains the string <em>ERROR</em>,
+            as described above, the node's status is reported as 'unhealthy'
+            and the node is black-listed on the JobTracker. No further tasks 
+            will be assigned to this node. However, the
+            TaskTracker continues to run the script, so that if the node
+            becomes healthy again, it will be removed from the blacklisted
+            nodes on the JobTracker automatically. The node's health
+            along with the output of the script, if it is unhealthy, is
+            available to the administrator in the JobTracker's web interface.
+            The time since the node was healthy is also displayed on the 
+            web interface.
+            </p>
+            
+            <section>
+            <title>Configuring the Node Health Check Script</title>
+            <p>The following parameters can be used to control the node health 
+            monitoring script in <em>mapred-site.xml</em>.</p>
+            <table>
+            <tr><th>Name</th><th>Description</th></tr>
+            <tr><td><code>mapred.healthChecker.script.path</code></td>
+            <td>Absolute path to the script which is periodically run by the 
+            TaskTracker to determine if the node is 
+            healthy or not. The file should be executable by the TaskTracker.
+            If the value of this key is empty or the file does 
+            not exist or is not executable, node health monitoring
+            is not started.</td>
+            </tr>
+            <tr>
+            <td><code>mapred.healthChecker.interval</code></td>
+            <td>Frequency at which the node health script is run, 
+            in milliseconds</td>
+            </tr>
+            <tr>
+            <td><code>mapred.healthChecker.script.timeout</code></td>
+            <td>Time after which the node health script will be killed by
+            the TaskTracker if unresponsive.
+            The node is marked unhealthy if the node health script times out.</td>
+            </tr>
+            <tr>
+            <td><code>mapred.healthChecker.script.args</code></td>
+            <td>Extra arguments that can be passed to the node health script 
+            when launched.
+            These should be a comma-separated list of arguments.</td>
+            </tr>
+            </table>
+            </section>
+          </section>
           
         </section>
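
As a rough illustration of the setup described above, the task-controller's ownership and permissions (using the example user mapred, group mapredGroup, and the placeholder install path from the text) could be established with:

$ chown root:mapredGroup /path-to-binary/task-controller
$ chmod 6050 /path-to-binary/task-controller
$ chown mapred /path-to-binary/../conf/taskcontroller.cfg
$ chmod 0400 /path-to-binary/../conf/taskcontroller.cfg

Likewise, a node health check script only has to print a line beginning with ERROR when the node is unhealthy. A minimal sketch, assuming a disk-usage check on an illustrative local directory, referenced from mapred.healthChecker.script.path in mapred-site.xml:

#!/bin/bash
# Report the node unhealthy when the disk holding the local mapred
# directory is more than 95% full (path and threshold are illustrative).
used=$(df -P /grid/hadoop/mapred/local | awk 'NR==2 {gsub("%",""); print $5}')
if [ "$used" -gt 95 ]; then
  echo "ERROR: local disk is ${used}% full"
fi
exit 0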
         

Added: hadoop/common/trunk/src/docs/src/documentation/content/xdocs/file_system_shell.xml
URL: http://svn.apache.org/viewvc/hadoop/common/trunk/src/docs/src/documentation/content/xdocs/file_system_shell.xml?rev=816409&view=auto
==============================================================================
--- hadoop/common/trunk/src/docs/src/documentation/content/xdocs/file_system_shell.xml (added)
+++ hadoop/common/trunk/src/docs/src/documentation/content/xdocs/file_system_shell.xml Thu Sep 17 23:24:38 2009
@@ -0,0 +1,569 @@
+<?xml version="1.0"?>
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "http://forrest.apache.org/dtd/document-v20.dtd">
+<document>
+	<header>
+		<title>File System Shell Guide</title>
+	</header>
+	<body>
+		<section>
+			<title>Overview</title>
+			<p>
+      The File System (FS) shell includes various shell-like commands that directly
+      interact with the Hadoop Distributed File System (HDFS) as well as other file systems that Hadoop supports,  
+      such as Local FS, HFTP FS, S3 FS, and others. The FS shell is invoked by: </p>
+
+    <source>bin/hdfs dfs &lt;args&gt;</source>
+    
+      <p>
+      All FS shell commands take path URIs as arguments. The URI
+      format is <em>scheme://authority/path</em>. For HDFS the scheme
+      is <em>hdfs</em>, and for the Local FS the scheme
+      is <em>file</em>. The scheme and authority are optional. If not
+      specified, the default scheme specified in the configuration is
+      used. An HDFS file or directory such as <em>/parent/child</em>
+      can be specified as <em>hdfs://namenodehost/parent/child</em> or
+      simply as <em>/parent/child</em> (given that your configuration
+      is set to point to <em>hdfs://namenodehost</em>). 
+      </p>
+     <p>
+      Most of the commands in FS shell behave like corresponding Unix
+      commands. Differences are described with each of the
+      commands. Error information is sent to <em>stderr</em> and the
+      output is sent to <em>stdout</em>.
+  </p>
+  
+  
+<!-- CAT --> 
+		<section>
+			<title> cat </title>
+			<p>
+				<code>Usage: hdfs dfs -cat URI [URI &#x2026;]</code>
+			</p>
+			<p>
+		   Copies source paths to <em>stdout</em>. 
+		   </p>
+			<p>Example:</p>
+			<ul>
+				<li>
+					<code> hdfs dfs -cat hdfs://nn1.example.com/file1 hdfs://nn2.example.com/file2 
+		   </code>
+				</li>
+				<li>
+					<code>hdfs dfs -cat file:///file3 /user/hadoop/file4 </code>
+				</li>
+			</ul>
+			<p>Exit Code:<br/>
+		   <code> Returns 0 on success and -1 on error. </code></p>
+		</section>
+		
+		
+<!-- CHGRP --> 
+		<section>
+			<title> chgrp </title>
+			<p>
+				<code>Usage: hdfs dfs -chgrp [-R] GROUP URI [URI &#x2026;]</code>
+			</p>
+			<p>
+	    Change group association of files. With <code>-R</code>, make the change recursively through the directory structure. 
+	    The user must be the owner of files, or else a super-user. 
+	    Additional information is in the <a href="http://hadoop.apache.org/hdfs/docs/current/hdfs_permissions_guide.html">HDFS Permissions Guide</a>.
+	    </p>
+		</section>
+		<section>
+			<title> chmod </title>
+			<p>
+				<code>Usage: hdfs dfs -chmod [-R] &lt;MODE[,MODE]... | OCTALMODE&gt; URI [URI &#x2026;]</code>
+			</p>
+			<p>
+	    Change the permissions of files. With <code>-R</code>, make the change recursively through the directory structure. 
+	    The user must be the owner of the file, or else a super-user. 
+	    Additional information is in the <a href="http://hadoop.apache.org/hdfs/docs/current/hdfs_permissions_guide.html">HDFS Permissions Guide</a>.
+	    </p>
+		</section>
+		
+		
+<!-- CHOWN --> 		
+		<section>
+			<title> chown </title>
+			<p>
+				<code>Usage: hdfs dfs -chown [-R] [OWNER][:[GROUP]] URI [URI ]</code>
+			</p>
+			<p>
+	    Change the owner of files. With <code>-R</code>, make the change recursively through the directory structure. 
+	    The user must be a super-user. 
+	    Additional information is in the <a href="http://hadoop.apache.org/hdfs/docs/current/hdfs_permissions_guide.html">HDFS Permissions Guide</a>.
+	    </p>
+		</section>
+		
+		
+<!-- COPYFROMLOCAL --> 		
+		<section>
+			<title>copyFromLocal</title>
+			<p>
+				<code>Usage: hdfs dfs -copyFromLocal &lt;localsrc&gt; URI</code>
+			</p>
+			<p>Similar to <a href="#put"><strong>put</strong></a> command, except that the source is restricted to a local file reference. </p>
+		</section>
+		
+		
+<!-- COPYTOLOCAL -->
+		<section>
+			<title> copyToLocal</title>
+			<p>
+				<code>Usage: hdfs dfs -copyToLocal [-ignorecrc] [-crc] URI &lt;localdst&gt;</code>
+			</p>
+			<p> Similar to <a href="#get"><strong>get</strong></a> command, except that the destination is restricted to a local file reference.</p>
+		</section>
+		
+<!-- COUNT -->		
+		<section>
+			<title> count </title>
+			<p>
+				<code>Usage: hdfs dfs -count [-q]  &lt;paths&gt;</code>
+			</p>
+			<p>
+				Count the number of directories, files and bytes under the paths that match the specified file pattern. <br/><br/>
+				The output columns with <code>-count </code> are:<br/><br/>
+				<code>DIR_COUNT, FILE_COUNT, CONTENT_SIZE FILE_NAME</code> <br/><br/>
+				The output columns with <code>-count -q</code> are:<br/><br/>
+				<code>QUOTA, REMAINING_QUOTA, SPACE_QUOTA, REMAINING_SPACE_QUOTA, 
+				DIR_COUNT, FILE_COUNT, CONTENT_SIZE, FILE_NAME</code>
+		   </p>
+			<p>Example:</p>
+			<ul>
+				<li>
+					<code> hdfs dfs -count hdfs://nn1.example.com/file1 hdfs://nn2.example.com/file2 
+		   </code>
+				</li>
+				<li>
+					<code> hdfs dfs -count -q hdfs://nn1.example.com/file1
+		   </code>
+				</li>
+			</ul>
+			<p>Exit Code:</p>
+			<p>
+				<code> Returns 0 on success and -1 on error.</code>
+			</p>
+		</section>
+		
+		
+<!-- CP -->		
+		<section>
+			<title> cp </title>
+			<p>
+				<code>Usage: hdfs dfs -cp URI [URI &#x2026;] &lt;dest&gt;</code>
+			</p>
+			<p>
+	    Copy files from source to destination. This command allows multiple sources as well in which case the destination must be a directory.
+	    <br/>
+	    Example:</p>
+			<ul>
+				<li>
+					<code> hdfs dfs -cp /user/hadoop/file1 /user/hadoop/file2</code>
+				</li>
+				<li>
+					<code> hdfs dfs -cp /user/hadoop/file1 /user/hadoop/file2 /user/hadoop/dir </code>
+				</li>
+			</ul>
+			<p>Exit Code:</p>
+			<p>
+				<code> Returns 0 on success and -1 on error.</code>
+			</p>
+		</section>
+		
+<!-- DU -->
+		<section>
+			<title>du</title>
+			<p>
+				<code>Usage: hdfs dfs -du URI [URI &#x2026;]</code>
+			</p>
+			<p>
+	     Displays aggregate length of files contained in the directory or the length of a file in case it's just a file.<br/>
+	     Example:<br/><code>hdfs dfs -du /user/hadoop/dir1 /user/hadoop/file1 hdfs://nn.example.com/user/hadoop/dir1</code><br/>
+	     Exit Code:<br/><code> Returns 0 on success and -1 on error. </code><br/></p>
+		</section>
+		
+<!-- DUS -->		
+		<section>
+			<title> dus </title>
+			<p>
+				<code>Usage: hdfs dfs -dus &lt;args&gt;</code>
+			</p>
+			<p>
+	    Displays a summary of file lengths.
+	   </p>
+		</section>
+		
+		
+<!-- EXPUNGE -->		
+		<section>
+			<title> expunge </title>
+			<p>
+				<code>Usage: hdfs dfs -expunge</code>
+			</p>
+			<p>Empty the Trash. Refer to the <a href="http://hadoop.apache.org/hdfs/docs/current/hdfs_design.html">HDFS Architecture Guide</a>
+			 for more information on the Trash feature.</p>
+		</section>
+
+
+<!-- GET -->			
+		<section>
+			<title> get </title>
+			<p>
+				<code>Usage: hdfs dfs -get [-ignorecrc] [-crc] &lt;src&gt; &lt;localdst&gt;</code>
+				<br/>
+			</p>
+			<p>
+	   Copy files to the local file system. Files that fail the CRC check may be copied with the  
+	   <code>-ignorecrc</code> option. Files and CRCs may be copied using the 
+	   <code>-crc</code> option.
+	  </p>
+			<p>Example:</p>
+			<ul>
+				<li>
+					<code> hdfs dfs -get /user/hadoop/file localfile </code>
+				</li>
+				<li>
+					<code> hdfs dfs -get hdfs://nn.example.com/user/hadoop/file localfile</code>
+				</li>
+			</ul>
+			<p>Exit Code:</p>
+			<p>
+				<code> Returns 0 on success and -1 on error. </code>
+			</p>
+		</section>
+		
+		
+<!-- GETMERGE -->			
+		<section>
+			<title> getmerge </title>
+			<p>
+				<code>Usage: hdfs dfs -getmerge &lt;src&gt; &lt;localdst&gt; [addnl]</code>
+			</p>
+			<p>
+	  Takes a source directory and a destination file as input and concatenates files in src into the destination local file. 
+	  Optionally <code>addnl</code> can be set to enable adding a newline character at the end of each file.  
+	  </p>
+		</section>
+		
+		
+<!-- LS -->		
+       <section>
+           <title>ls</title>
+           <p>
+               <code>Usage: hdfs dfs -ls &lt;args&gt;</code>
+           </p>
+           <p>For a file returns stat on the file with the following format:</p>
+           <p>
+               <code>permissions number_of_replicas userid  groupid  filesize modification_date modification_time filename</code>
+           </p>
+           <p>For a directory, it returns the list of its direct children, as in Unix. A directory is listed as:</p>
+           <p>
+               <code>permissions userid groupid modification_date modification_time dirname</code>
+           </p>
+           <p>Example:</p>
+           <p>
+               <code>hdfs dfs -ls /user/hadoop/file1 </code>
+           </p>
+           <p>Exit Code:</p>
+           <p>
+               <code>Returns 0 on success and -1 on error.</code>
+           </p>
+       </section>
+       
+       
+<!-- LSR -->       
+		<section>
+			<title>lsr</title>
+			<p><code>Usage: hdfs dfs -lsr &lt;args&gt;</code><br/>
+	      Recursive version of <code>ls</code>. Similar to Unix <code>ls -R</code>.
+	      </p>
+		</section>
+		
+		
+<!-- MKDIR -->  
+		<section>
+			<title> mkdir </title>
+			<p>
+				<code>Usage: hdfs dfs -mkdir &lt;paths&gt;</code>
+				<br/>
+			</p>
+			<p>
+	   Takes path URIs as arguments and creates directories. The behavior is much like Unix <code>mkdir -p</code>, creating parent directories along the path.
+	  </p>
+			<p>Example:</p>
+			<ul>
+				<li>
+					<code>hdfs dfs -mkdir /user/hadoop/dir1 /user/hadoop/dir2 </code>
+				</li>
+				<li>
+					<code>hdfs dfs -mkdir hdfs://nn1.example.com/user/hadoop/dir hdfs://nn2.example.com/user/hadoop/dir
+	  </code>
+				</li>
+			</ul>
+			<p>Exit Code:</p>
+			<p>
+				<code>Returns 0 on success and -1 on error.</code>
+			</p>
+		</section>
+		
+		
+<!-- MOVEFROMLOCAL -->  
+		<section>
+			<title> moveFromLocal </title>
+			<p>
+				<code>Usage: hdfs dfs -moveFromLocal &lt;localsrc&gt; &lt;dst&gt;</code>
+			</p>
+			<p>Similar to <a href="#put"><strong>put</strong></a> command, except that the source <code>localsrc</code> is deleted after it's copied. </p>
+		</section>
+		
+		
+<!-- MOVETOLOCAL -->  
+		<section>
+			<title> moveToLocal</title>
+			<p>
+				<code>Usage: hdfs dfs -moveToLocal [-crc] &lt;src&gt; &lt;dst&gt;</code>
+			</p>
+			<p>Displays a "Not implemented yet" message.</p>
+		</section>
+		
+		
+<!-- MV -->  
+		<section>
+			<title> mv </title>
+			<p>
+				<code>Usage: hdfs dfs -mv URI [URI &#x2026;] &lt;dest&gt;</code>
+			</p>
+			<p>
+	    Moves files from source to destination. This command allows multiple sources as well in which case the destination needs to be a directory. 
+	    Moving files across file systems is not permitted.
+	    <br/>
+	    Example:
+	    </p>
+			<ul>
+				<li>
+					<code> hdfs dfs -mv /user/hadoop/file1 /user/hadoop/file2</code>
+				</li>
+				<li>
+					<code> hdfs dfs -mv hdfs://nn.example.com/file1 hdfs://nn.example.com/file2 hdfs://nn.example.com/file3 hdfs://nn.example.com/dir1</code>
+				</li>
+			</ul>
+			<p>Exit Code:</p>
+			<p>
+				<code> Returns 0 on success and -1 on error.</code>
+			</p>
+		</section>
+		
+		
+<!-- PUT --> 
+		<section>
+			<title> put </title>
+			<p>
+				<code>Usage: hdfs dfs -put &lt;localsrc&gt; ... &lt;dst&gt;</code>
+			</p>
+			<p>Copy single src, or multiple srcs from local file system to the destination file system. 
+			Also reads input from stdin and writes to destination file system.<br/>
+	   </p>
+			<ul>
+				<li>
+					<code> hdfs dfs -put localfile /user/hadoop/hadoopfile</code>
+				</li>
+				<li>
+					<code> hdfs dfs -put localfile1 localfile2 /user/hadoop/hadoopdir</code>
+				</li>
+				<li>
+					<code> hdfs dfs -put localfile hdfs://nn.example.com/hadoop/hadoopfile</code>
+				</li>
+				<li><code>hdfs dfs -put - hdfs://nn.example.com/hadoop/hadoopfile</code><br/>Reads the input from stdin.</li>
+			</ul>
+			<p>Exit Code:</p>
+			<p>
+				<code> Returns 0 on success and -1 on error. </code>
+			</p>
+		</section>
+		
+		
+<!-- RM --> 
+		<section>
+			<title> rm </title>
+			<p>
+				<code>Usage: hdfs dfs -rm [-skipTrash] URI [URI &#x2026;] </code>
+			</p>
+			<p>
+	   Delete files specified as args. Only deletes files and empty directories. If the <code>-skipTrash</code> option
+	   is specified, the trash, if enabled, will be bypassed and the specified file(s) deleted immediately.  	This can be
+		   useful when it is necessary to delete files from an over-quota directory.
+	   Refer to rmr for recursive deletes.<br/>
+	   Example:
+	   </p>
+			<ul>
+				<li>
+					<code> hdfs dfs -rm hdfs://nn.example.com/file /user/hadoop/emptydir </code>
+				</li>
+			</ul>
+			<p>Exit Code:</p>
+			<p>
+				<code> Returns 0 on success and -1 on error.</code>
+			</p>
+		</section>
+		
+		
+<!-- RMR --> 
+		<section>
+			<title> rmr </title>
+			<p>
+				<code>Usage: hdfs dfs -rmr [-skipTrash] URI [URI &#x2026;]</code>
+			</p>
+			<p>Recursive version of delete. If the <code>-skipTrash</code> option
+		   is specified, the trash, if enabled, will be bypassed and the specified file(s) deleted immediately. This can be
+		   useful when it is necessary to delete files from an over-quota directory.<br/>
+	   Example:
+	   </p>
+			<ul>
+				<li>
+					<code> hdfs dfs -rmr /user/hadoop/dir </code>
+				</li>
+				<li>
+					<code> hdfs dfs -rmr hdfs://nn.example.com/user/hadoop/dir </code>
+				</li>
+			</ul>
+			<p>Exit Code:</p>
+			<p>
+				<code> Returns 0 on success and -1 on error. </code>
+			</p>
+		</section>
+		
+		
+<!-- SETREP --> 
+		<section>
+			<title> setrep </title>
+			<p>
+				<code>Usage: hdfs dfs -setrep [-R] [-w] &lt;rep&gt; &lt;path&gt;</code>
+			</p>
+			<p>
+	   Changes the replication factor of a file. -R option is for recursively increasing the replication factor of files within a directory.
+	  </p>
+			<p>Example:</p>
+			<ul>
+				<li>
+					<code> hdfs dfs -setrep -w 3 -R /user/hadoop/dir1 </code>
+				</li>
+			</ul>
+			<p>Exit Code:</p>
+			<p>
+				<code>Returns 0 on success and -1 on error. </code>
+			</p>
+		</section>
+		
+		
+<!-- STAT --> 
+		<section>
+			<title> stat </title>
+			<p>
+				<code>Usage: hdfs dfs -stat URI [URI &#x2026;]</code>
+			</p>
+			<p>
+	   Returns the stat information on the path.
+	   </p>
+			<p>Example:</p>
+			<ul>
+				<li>
+					<code> hdfs dfs -stat path </code>
+				</li>
+			</ul>
+			<p>Exit Code:<br/>
+	   <code> Returns 0 on success and -1 on error.</code></p>
+		</section>
+		
+		
+<!-- TAIL--> 
+		<section>
+			<title> tail </title>
+			<p>
+				<code>Usage: hdfs dfs -tail [-f] URI </code>
+			</p>
+			<p>
+	   Displays last kilobyte of the file to stdout. -f option can be used as in Unix.
+	   </p>
+			<p>Example:</p>
+			<ul>
+				<li>
+					<code> hdfs dfs -tail pathname </code>
+				</li>
+			</ul>
+			<p>Exit Code: <br/>
+	   <code> Returns 0 on success and -1 on error.</code></p>
+		</section>
+		
+		
+<!-- TEST --> 
+		<section>
+			<title> test </title>
+			<p>
+				<code>Usage: hdfs dfs -test -[ezd] URI</code>
+			</p>
+			<p>
+	   Options: <br/>
+	   -e check to see if the file exists. Return 0 if true. <br/>
+	   -z check to see if the file is zero length. Return 0 if true. <br/>
+	   -d check to see if the path is directory. Return 0 if true. <br/></p>
+			<p>Example:</p>
+			<ul>
+				<li>
+					<code> hdfs dfs -test -e filename </code>
+				</li>
+			</ul>
+		</section>
+		
+		
+<!-- TEXT --> 
+		<section>
+			<title> text </title>
+			<p>
+				<code>Usage: hdfs dfs -text &lt;src&gt;</code>
+				<br/>
+			</p>
+			<p>
+	   Takes a source file and outputs the file in text format. The allowed formats are zip and TextRecordInputStream.
+	  </p>
+		</section>
+		
+		
+<!-- TOUCHZ --> 
+		<section>
+			<title> touchz </title>
+			<p>
+				<code>Usage: hdfs dfs -touchz URI [URI &#x2026;]</code>
+				<br/>
+			</p>
+			<p>
+	   Create a file of zero length.
+	   </p>
+			<p>Example:</p>
+			<ul>
+				<li>
+					<code> hdfs dfs -touchz pathname </code>
+				</li>
+			</ul>
+			<p>Exit Code:<br/>
+	   <code> Returns 0 on success and -1 on error.</code></p>
+		</section>
+        </section>
+	</body>
+</document>
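
Taken together, the commands above support simple end-to-end workflows; an illustrative session (paths and file names are examples only):

$ bin/hdfs dfs -mkdir /user/hadoop/dir1
$ bin/hdfs dfs -put localfile.txt /user/hadoop/dir1
$ bin/hdfs dfs -ls /user/hadoop/dir1
$ bin/hdfs dfs -cat /user/hadoop/dir1/localfile.txt
$ bin/hdfs dfs -rmr /user/hadoop/dir1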

Modified: hadoop/common/trunk/src/docs/src/documentation/content/xdocs/index.xml
URL: http://svn.apache.org/viewvc/hadoop/common/trunk/src/docs/src/documentation/content/xdocs/index.xml?rev=816409&r1=816408&r2=816409&view=diff
==============================================================================
--- hadoop/common/trunk/src/docs/src/documentation/content/xdocs/index.xml (original)
+++ hadoop/common/trunk/src/docs/src/documentation/content/xdocs/index.xml Thu Sep 17 23:24:38 2009
@@ -26,12 +26,23 @@
   
   <body>
   <p>
-  The Hadoop Documentation provides the information you need to get started using Hadoop, the Hadoop Distributed File System (HDFS), and Hadoop on Demand (HOD).
-  </p><p>
-Begin with the <a href="quickstart.html">Hadoop Quick Start</a> which shows you how to set up a single-node Hadoop installation. Then move on to the <a href="cluster_setup.html">Hadoop Cluster Setup</a> to learn how to set up a multi-node Hadoop installation. Once your Hadoop installation is in place, try out the <a href="mapred_tutorial.html">Hadoop Map/Reduce Tutorial</a>. 
-  </p><p>
-If you have more questions, you can ask on the <a href="ext:lists">Hadoop Core Mailing Lists</a> or browse the <a href="ext:archive">Mailing List Archives</a>.
-    </p>
+The Hadoop Common Documentation describes the common utilities and libraries that support the other Hadoop subprojects.  
+  </p>
+  <p>
+The Hadoop Common Documentation also includes the information you need to get started using Hadoop. 
+Begin with the Hadoop <a href="single_node_setup.html">Single Node Setup</a> which shows you how to set up a single-node Hadoop installation. 
+Then move on to the Hadoop <a href="cluster_setup.html">Cluster Setup</a> to learn how to set up a multi-node Hadoop installation. 
+</p>
+ <p>
+   Cluster environments commonly work in tandem with MapReduce applications and distributed file systems. 
+   For information about MapReduce see the 
+ <a href="http://hadoop.apache.org/mapreduce/docs/current/index.html">MapReduce Documentation</a>.
+   For information about the Hadoop Distributed File System (HDFS) see the 
+ <a href="http://hadoop.apache.org/hdfs/docs/current/index.html">HDFS Documentation</a>.
+  </p>  
+<p>
+If you have more questions, you can ask on the <a href="ext:lists">Hadoop Common Mailing Lists</a> or browse the <a href="ext:archive">Mailing List Archives</a>.
+</p>
   </body>
   
 </document>

Modified: hadoop/common/trunk/src/docs/src/documentation/content/xdocs/native_libraries.xml
URL: http://svn.apache.org/viewvc/hadoop/common/trunk/src/docs/src/documentation/content/xdocs/native_libraries.xml?rev=816409&r1=816408&r2=816409&view=diff
==============================================================================
--- hadoop/common/trunk/src/docs/src/documentation/content/xdocs/native_libraries.xml (original)
+++ hadoop/common/trunk/src/docs/src/documentation/content/xdocs/native_libraries.xml Thu Sep 17 23:24:38 2009
@@ -149,7 +149,7 @@
         </li>
       </ul>
 
-      <p>Once you have the pre-requisites use the standard <code>build.xml</code> 
+      <p>Once you have the prerequisites use the standard <code>build.xml</code> 
       and pass along the <code>compile.native</code> flag (set to 
       <code>true</code>) to build the native hadoop library:</p>
 
@@ -186,13 +186,13 @@
       </section>
     </section>
     <section>
-      <title> Loading native libraries through DistributedCache </title>
+      <title> Loading Native Libraries Through DistributedCache </title>
       <p>User can load native shared libraries through  
-      <a href="mapred_tutorial.html#DistributedCache">DistributedCache</a>
-      for <em>distributing</em> and <em>symlinking</em> the library files</p>
+      <a href="http://hadoop.apache.org/mapreduce/docs/current/mapred_tutorial.html#DistributedCache">DistributedCache</a> 
+      for <em>distributing</em> and <em>symlinking</em> the library files.</p>
       
       <p>Here is an example, describing how to distribute the library and
-      load it from map/reduce task. </p>
+      load it from a MapReduce task. </p>
       <ol>
       <li> First copy the library to the HDFS. <br/>
       <code>bin/hadoop fs -copyFromLocal mylib.so.1 /libraries/mylib.so.1</code>
@@ -202,7 +202,7 @@
       <code> DistributedCache.addCacheFile("hdfs://host:port/libraries/mylib.so.1#mylib.so", conf);
       </code>
       </li>
-      <li> The map/reduce task can contain: <br/>
+      <li> The MapReduce task can contain: <br/>
       <code> System.loadLibrary("mylib.so"); </code>
       </li>
       </ol>

Modified: hadoop/common/trunk/src/docs/src/documentation/content/xdocs/service_level_auth.xml
URL: http://svn.apache.org/viewvc/hadoop/common/trunk/src/docs/src/documentation/content/xdocs/service_level_auth.xml?rev=816409&r1=816408&r2=816409&view=diff
==============================================================================
--- hadoop/common/trunk/src/docs/src/documentation/content/xdocs/service_level_auth.xml (original)
+++ hadoop/common/trunk/src/docs/src/documentation/content/xdocs/service_level_auth.xml Thu Sep 17 23:24:38 2009
@@ -34,17 +34,15 @@
     </section>
     
     <section>
-      <title>Pre-requisites</title>
+      <title>Prerequisites</title>
       
-      <p>Ensure that Hadoop is installed, configured and setup correctly. More
-      details:</p> 
+      <p>Make sure Hadoop is installed, configured and set up correctly. For more information see:</p> 
       <ul>
         <li>
-          <a href="quickstart.html">Hadoop Quick Start</a> for first-time users.
+          <a href="single_node_setup.html">Single Node Setup</a> for first-time users.
         </li>
         <li>
-          <a href="cluster_setup.html">Hadoop Cluster Setup</a> for large, 
-          distributed clusters.
+          <a href="cluster_setup.html">Cluster Setup</a> for large, distributed clusters.
         </li>
       </ul>
     </section>
@@ -55,7 +53,7 @@
       <p>Service Level Authorization is the initial authorization mechanism to
       ensure clients connecting to a particular Hadoop <em>service</em> have the
       necessary, pre-configured, permissions and are authorized to access the given
-      service. For e.g. a Map/Reduce cluster can use this mechanism to allow a
+      service. For example, a MapReduce cluster can use this mechanism to allow a
       configured list of users/groups to submit jobs.</p>
       
       <p>The <code>${HADOOP_CONF_DIR}/hadoop-policy.xml</code> configuration file 
@@ -198,33 +196,33 @@
         <title>Examples</title>
         
         <p>Allow only users <code>alice</code>, <code>bob</code> and users in the 
-        <code>mapreduce</code> group to submit jobs to the Map/Reduce cluster:</p>
+        <code>mapreduce</code> group to submit jobs to the MapReduce cluster:</p>
         
-        <table>
-          <tr><td>&nbsp;&nbsp;&lt;property&gt;</td></tr>
-            <tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&lt;name&gt;security.job.submission.protocol.acl&lt;/name&gt;</td></tr>
-            <tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&lt;value&gt;alice,bob mapreduce&lt;/value&gt;</td></tr>
-          <tr><td>&nbsp;&nbsp;&lt;/property&gt;</td></tr>
-        </table>
+<source>
+&lt;property&gt;
+     &lt;name&gt;security.job.submission.protocol.acl&lt;/name&gt;
+     &lt;value&gt;alice,bob mapreduce&lt;/value&gt;
+&lt;/property&gt;
+</source>        
         
         <p></p><p>Allow only DataNodes running as the users who belong to the 
         group <code>datanodes</code> to communicate with the NameNode:</p> 
-        
-        <table>
-          <tr><td>&nbsp;&nbsp;&lt;property&gt;</td></tr>
-            <tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&lt;name&gt;security.datanode.protocol.acl&lt;/name&gt;</td></tr>
-            <tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&lt;value&gt; datanodes&lt;/value&gt;</td></tr>
-          <tr><td>&nbsp;&nbsp;&lt;/property&gt;</td></tr>
-        </table>
+ 
+<source>
+&lt;property&gt;
+     &lt;name&gt;security.datanode.protocol.acl&lt;/name&gt;
+     &lt;value&gt;datanodes&lt;/value&gt;
+&lt;/property&gt;
+</source>        
         
         <p></p><p>Allow any user to talk to the HDFS cluster as a DFSClient:</p>
-        
-        <table>
-          <tr><td>&nbsp;&nbsp;&lt;property&gt;</td></tr>
-            <tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&lt;name&gt;security.client.protocol.acl&lt;/name&gt;</td></tr>
-            <tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&lt;value&gt;*&lt;/value&gt;</td></tr>
-          <tr><td>&nbsp;&nbsp;&lt;/property&gt;</td></tr>
-        </table>
+
+<source>
+&lt;property&gt;
+     &lt;name&gt;security.client.protocol.acl&lt;/name&gt;
+     &lt;value&gt;*&lt;/value&gt;
+&lt;/property&gt;
+</source>        
         
       </section>
     </section>
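
Note that changes to ${HADOOP_CONF_DIR}/hadoop-policy.xml on a running cluster take effect only after the service-level authorization configuration is reloaded. Assuming the refreshServiceAcl admin commands are available in this release, that looks like:

$ bin/hadoop dfsadmin -refreshServiceAcl    # reload the NameNode's policy
$ bin/hadoop mradmin -refreshServiceAcl     # reload the JobTracker's policy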

Added: hadoop/common/trunk/src/docs/src/documentation/content/xdocs/single_node_setup.xml
URL: http://svn.apache.org/viewvc/hadoop/common/trunk/src/docs/src/documentation/content/xdocs/single_node_setup.xml?rev=816409&view=auto
==============================================================================
--- hadoop/common/trunk/src/docs/src/documentation/content/xdocs/single_node_setup.xml (added)
+++ hadoop/common/trunk/src/docs/src/documentation/content/xdocs/single_node_setup.xml Thu Sep 17 23:24:38 2009
@@ -0,0 +1,293 @@
+<?xml version="1.0"?>
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "http://forrest.apache.org/dtd/document-v20.dtd">
+
+<document>
+  
+  <header>
+    <title>Single Node Setup</title>
+  </header>
+  
+  <body>
+  
+    <section>
+      <title>Purpose</title>
+      
+      <p>This document describes how to set up and configure a single-node Hadoop
+      installation so that you can quickly perform simple operations using Hadoop
+      MapReduce and the Hadoop Distributed File System (HDFS).</p>
+      
+    </section>
+    
+    <section id="PreReqs">
+      <title>Prerequisites</title>
+      
+      <section>
+        <title>Supported Platforms</title>
+        
+        <ul>
+          <li>
+            GNU/Linux is supported as a development and production platform. 
+            Hadoop has been demonstrated on GNU/Linux clusters with 2000 nodes.
+          </li>
+          <li>
+            Win32 is supported as a <em>development platform</em>. Distributed 
+            operation has not been well tested on Win32, so it is not 
+            supported as a <em>production platform</em>.
+          </li>
+        </ul>        
+      </section>
+      
+      <section>
+        <title>Required Software</title>
+        <p>Required software for Linux and Windows includes:</p>
+        <ol>
+          <li>
+            Java<sup>TM</sup> 1.6.x, preferably from Sun, must be installed.
+          </li>
+          <li>
+            <strong>ssh</strong> must be installed and <strong>sshd</strong> must 
+            be running to use the Hadoop scripts that manage remote Hadoop 
+            daemons.
+          </li>
+        </ol>
+        <p>Additional requirements for Windows include:</p>
+        <ol>
+          <li>
+            <a href="http://www.cygwin.com/">Cygwin</a> - Required for shell 
+            support in addition to the required software above. 
+          </li>
+        </ol>
+      </section>
+
+      <section>
+        <title>Installing Software</title>
+          
+        <p>If your cluster doesn't have the requisite software you will need to
+        install it.</p>
+          
+        <p>For example on Ubuntu Linux:</p>
+        <p>
+          <code>$ sudo apt-get install ssh</code><br/>
+          <code>$ sudo apt-get install rsync</code>
+        </p>
+          
+        <p>On Windows, if you did not install the required software when you 
+        installed cygwin, start the cygwin installer and select the packages:</p>
+        <ul>
+          <li>openssh - the <em>Net</em> category</li>
+        </ul>
+      </section>
+      
+    </section>
+    
+    <section>
+      <title>Download</title>
+      
+      <p>
+        To get a Hadoop distribution, download a recent 
+        <a href="ext:releases">stable release</a> from one of the Apache Download
+        Mirrors.
+      </p>
+    </section>
+
+    <section>
+      <title>Prepare to Start the Hadoop Cluster</title>
+      <p>
+        Unpack the downloaded Hadoop distribution. In the distribution, edit the
+        file <code>conf/hadoop-env.sh</code> to define at least 
+        <code>JAVA_HOME</code> to be the root of your Java installation.
+      </p>
+
+	  <p>
+	    Try the following command:<br/>
+        <code>$ bin/hadoop</code><br/>
+        This will display the usage documentation for the <strong>hadoop</strong> 
+        script.
+      </p>
+      
+      <p>Now you are ready to start your Hadoop cluster in one of the three supported
+      modes:
+      </p>
+      <ul>
+        <li>Local (Standalone) Mode</li>
+        <li>Pseudo-Distributed Mode</li>
+        <li>Fully-Distributed Mode</li>
+      </ul>
+    </section>
+    
+    <section id="Local">
+      <title>Standalone Operation</title>
+      
+      <p>By default, Hadoop is configured to run in a non-distributed 
+      mode, as a single Java process. This is useful for debugging.</p>
+      
+      <p>
+        The following example copies the unpacked <code>conf</code> directory to 
+        use as input and then finds and displays every match of the given regular 
+        expression. Output is written to the given <code>output</code> directory.
+        <br/>
+        <code>$ mkdir input</code><br/>
+        <code>$ cp conf/*.xml input</code><br/>
+        <code>
+          $ bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
+        </code><br/>
+        <code>$ cat output/*</code>
+      </p>
+    </section>
+    
+    <section id="PseudoDistributed">
+      <title>Pseudo-Distributed Operation</title>
+
+	  <p>Hadoop can also be run on a single-node in a pseudo-distributed mode 
+	  where each Hadoop daemon runs in a separate Java process.</p>
+	  
+      <section>
+        <title>Configuration</title>
+        <p>Use the following:
+        <br/><br/>
+        <code>conf/core-site.xml</code>:</p>
+        
+        <source>
+&lt;configuration&gt;
+     &lt;property&gt;
+         &lt;name&gt;fs.default.name&lt;/name&gt;
+         &lt;value&gt;hdfs://localhost:9000&lt;/value&gt;
+     &lt;/property&gt;
+&lt;/configuration&gt;
+</source>
+      
+        <p><br/><code>conf/hdfs-site.xml</code>:</p>  
+<source>
+&lt;configuration&gt;
+     &lt;property&gt;
+         &lt;name&gt;dfs.replication&lt;/name&gt;
+         &lt;value&gt;1&lt;/value&gt;
+     &lt;/property&gt;
+&lt;/configuration&gt;
+</source>        
+        
+      
+        <p><br/><code>conf/mapred-site.xml</code>:</p>
+<source>
+&lt;configuration&gt;
+     &lt;property&gt;
+         &lt;name&gt;mapred.job.tracker&lt;/name&gt;
+         &lt;value&gt;localhost:9001&lt;/value&gt;
+     &lt;/property&gt;
+&lt;/configuration&gt;
+</source>        
+        
+        
+        
+      </section>
+
+      <section>
+        <title>Setup passphraseless <em>ssh</em></title>
+        
+        <p>
+          Now check that you can ssh to the localhost without a passphrase:<br/>
+          <code>$ ssh localhost</code>
+        </p>
+        
+        <p>
+          If you cannot ssh to localhost without a passphrase, execute the 
+          following commands:<br/>
+   		  <code>$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa</code><br/>
+		  <code>$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys</code>
+		</p>
+      </section>
+    
+      <section>
+        <title>Execution</title>
+        
+        <p>
+          Format a new distributed-filesystem:<br/>
+          <code>$ bin/hadoop namenode -format</code>
+        </p>
+
+		<p>
+		  Start the hadoop daemons:<br/>
+          <code>$ bin/start-all.sh</code>
+        </p>
+
+        <p>The hadoop daemon log output is written to the 
+        <code>${HADOOP_LOG_DIR}</code> directory (defaults to 
+        <code>${HADOOP_HOME}/logs</code>).</p>
+
+        <p>Browse the web interface for the NameNode and the JobTracker; by
+        default they are available at:</p>
+        <ul>
+          <li>
+            <code>NameNode</code> - 
+            <a href="http://localhost:50070/">http://localhost:50070/</a>
+          </li>
+          <li>
+            <code>JobTracker</code> - 
+            <a href="http://localhost:50030/">http://localhost:50030/</a>
+          </li>
+        </ul>
+        
+        <p>
+          Copy the input files into the distributed filesystem:<br/>
+		  <code>$ bin/hadoop fs -put conf input</code>
+		</p>
+		
+        <p>
+          Run some of the examples provided:<br/>
+          <code>
+            $ bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
+          </code>
+        </p>
+        
+        <p>Examine the output files:</p>
+        <p>
+          Copy the output files from the distributed filesystem to the local 
+          filesystem and examine them:<br/>
+          <code>$ bin/hadoop fs -get output output</code><br/>
+          <code>$ cat output/*</code>
+        </p>
+        <p> or </p>
+        <p>
+          View the output files on the distributed filesystem:<br/>
+          <code>$ bin/hadoop fs -cat output/*</code>
+        </p>
+
+		<p>
+		  When you're done, stop the daemons with:<br/>
+		  <code>$ bin/stop-all.sh</code>
+		</p>
+      </section>
+    </section>
+    
+    <section id="FullyDistributed">
+      <title>Fully-Distributed Operation</title>
+      
+	  <p>For information on setting up fully-distributed, non-trivial clusters
+	  see <a href="cluster_setup.html">Cluster Setup</a>.</p>  
+    </section>
+    
+    <p>
+      <em>Java and JNI are trademarks or registered trademarks of 
+      Sun Microsystems, Inc. in the United States and other countries.</em>
+    </p>
+    
+  </body>
+  
+</document>

Modified: hadoop/common/trunk/src/docs/src/documentation/content/xdocs/site.xml
URL: http://svn.apache.org/viewvc/hadoop/common/trunk/src/docs/src/documentation/content/xdocs/site.xml?rev=816409&r1=816408&r2=816409&view=diff
==============================================================================
--- hadoop/common/trunk/src/docs/src/documentation/content/xdocs/site.xml (original)
+++ hadoop/common/trunk/src/docs/src/documentation/content/xdocs/site.xml Thu Sep 17 23:24:38 2009
@@ -34,40 +34,16 @@
   
    <docs label="Getting Started"> 
 		<overview   				label="Overview" 					href="index.html" />
-		<quickstart 				label="Quick Start"        		href="quickstart.html" />
+		<quickstart 				label="Single Node Setup"      href="single_node_setup.html" />
 		<setup     					label="Cluster Setup"      		href="cluster_setup.html" />
-		<mapred    				label="Map/Reduce Tutorial" 	href="mapred_tutorial.html" />
   </docs>	
 		
- <docs label="Programming Guides">
-		<commands 				label="Commands"     					href="commands_manual.html" />
-		<distcp    					label="DistCp"       						href="distcp.html" />
-		<native_lib    				label="Native Libraries" 					href="native_libraries.html" />
-		<streaming 				label="Streaming"          				href="streaming.html" />
-		<fair_scheduler 			label="Fair Scheduler" 					href="fair_scheduler.html"/>
-		<cap_scheduler 		label="Capacity Scheduler" 			href="capacity_scheduler.html"/>
+ <docs label="Guides">
+		<fsshell				        label="File System Shell"               href="file_system_shell.html" />
 		<SLA					 	label="Service Level Authorization" 	href="service_level_auth.html"/>
-		<vaidya    					label="Vaidya" 								href="vaidya.html"/>
-		<archives  				label="Archives"     						href="hadoop_archives.html"/>
+		<native_lib    				label="Native Libraries" 					href="native_libraries.html" />
    </docs>
-   
-   <docs label="HDFS">
-		<hdfs_user      				label="User Guide"    							href="hdfs_user_guide.html" />
-		<hdfs_arch     				label="Architecture"  								href="hdfs_design.html" />	
-		<hdfs_fs       	 				label="File System Shell Guide"     		href="hdfs_shell.html" />
-		<hdfs_perm      				label="Permissions Guide"    					href="hdfs_permissions_guide.html" />
-		<hdfs_quotas     			label="Quotas Guide" 							href="hdfs_quota_admin_guide.html" />
-		<hdfs_SLG        			label="Synthetic Load Generator Guide"  href="SLG_user_guide.html" />
-		<hdfs_imageviewer						label="Offline Image Viewer Guide"	href="hdfs_imageviewer.html" />
-		<hdfs_libhdfs   				label="C API libhdfs"         						href="libhdfs.html" /> 
-   </docs> 
-   
-   <docs label="HOD">
-		<hod_user 	label="User Guide" 	href="hod_user_guide.html"/>
-		<hod_admin 	label="Admin Guide" 	href="hod_admin_guide.html"/>
-		<hod_config 	label="Config Guide" 	href="hod_config_guide.html"/> 
-   </docs> 
-   
+
    <docs label="Miscellaneous"> 
 		<api       	label="API Docs"           href="ext:api/index" />
 		<jdiff     	label="API Changes"      href="ext:jdiff/changes" />
@@ -78,24 +54,26 @@
    </docs> 
    
   <external-refs>
-    <site      href="http://hadoop.apache.org/core/"/>
-    <lists     href="http://hadoop.apache.org/core/mailing_lists.html"/>
-    <archive   href="http://mail-archives.apache.org/mod_mbox/hadoop-core-commits/"/>
-    <releases  href="http://hadoop.apache.org/core/releases.html">
+    <site      href="http://hadoop.apache.org/common/"/>
+    <lists     href="http://hadoop.apache.org/common/mailing_lists.html"/>
+    <archive   href="http://mail-archives.apache.org/mod_mbox/hadoop-common-commits/"/>
+    <releases  href="http://hadoop.apache.org/common/releases.html">
       <download href="#Download" />
     </releases>
-    <jira      href="http://hadoop.apache.org/core/issue_tracking.html"/>
-    <wiki      href="http://wiki.apache.org/hadoop/" />
-    <faq       href="http://wiki.apache.org/hadoop/FAQ" />
-    <hadoop-default href="http://hadoop.apache.org/core/docs/current/hadoop-default.html" />
-    <core-default href="http://hadoop.apache.org/core/docs/current/core-default.html" />
-    <hdfs-default href="http://hadoop.apache.org/core/docs/current/hdfs-default.html" />
-    <mapred-default href="http://hadoop.apache.org/core/docs/current/mapred-default.html" />
+    <jira  href="http://hadoop.apache.org/common/issue_tracking.html"/>
+    <wiki  href="http://wiki.apache.org/hadoop/Common" />
+    <faq  href="http://wiki.apache.org/hadoop/Common/FAQ" />
+    
+    <common-default href="http://hadoop.apache.org/common/docs/current/common-default.html" />
+    <hdfs-default href="http://hadoop.apache.org/hdfs/docs/current/hdfs-default.html" />
+    <mapred-default href="http://hadoop.apache.org/mapreduce/docs/current/mapred-default.html" />
+    
     <zlib      href="http://www.zlib.net/" />
     <gzip      href="http://www.gzip.org/" />
     <bzip      href="http://www.bzip.org/" />
     <cygwin    href="http://www.cygwin.com/" />
     <osx       href="http://www.apple.com/macosx" />
+    
     <hod href="">
       <cluster-resources href="http://www.clusterresources.com" />
       <torque href="http://www.clusterresources.com/pages/products/torque-resource-manager.php" />
@@ -109,6 +87,7 @@
       <python href="http://www.python.org" />
       <twisted-python href="http://twistedmatrix.com/trac/" />
     </hod>
+    
     <relnotes href="releasenotes.html" />
     <changes href="changes.html" />
     <jdiff href="jdiff/">

Modified: hadoop/common/trunk/src/docs/src/documentation/content/xdocs/tabs.xml
URL: http://svn.apache.org/viewvc/hadoop/common/trunk/src/docs/src/documentation/content/xdocs/tabs.xml?rev=816409&r1=816408&r2=816409&view=diff
==============================================================================
--- hadoop/common/trunk/src/docs/src/documentation/content/xdocs/tabs.xml (original)
+++ hadoop/common/trunk/src/docs/src/documentation/content/xdocs/tabs.xml Thu Sep 17 23:24:38 2009
@@ -30,8 +30,8 @@
     directory (ends in '/'), in which case /index.html will be added
   -->
 
-  <tab label="Project" href="http://hadoop.apache.org/core/" />
+  <tab label="Project" href="http://hadoop.apache.org/common/" />
   <tab label="Wiki" href="http://wiki.apache.org/hadoop" />
-  <tab label="Hadoop 0.21 Documentation" dir="" />  
+  <tab label="Common 0.21 Documentation" dir="" />  
   
 </tabs>

Added: hadoop/common/trunk/src/docs/src/documentation/resources/images/common-logo.jpg
URL: http://svn.apache.org/viewvc/hadoop/common/trunk/src/docs/src/documentation/resources/images/common-logo.jpg?rev=816409&view=auto
==============================================================================
Binary file - no diff available.

Propchange: hadoop/common/trunk/src/docs/src/documentation/resources/images/common-logo.jpg
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Modified: hadoop/common/trunk/src/docs/src/documentation/skinconf.xml
URL: http://svn.apache.org/viewvc/hadoop/common/trunk/src/docs/src/documentation/skinconf.xml?rev=816409&r1=816408&r2=816409&view=diff
==============================================================================
--- hadoop/common/trunk/src/docs/src/documentation/skinconf.xml (original)
+++ hadoop/common/trunk/src/docs/src/documentation/skinconf.xml Thu Sep 17 23:24:38 2009
@@ -68,7 +68,7 @@
   <project-name>Hadoop</project-name>
   <project-description>Scalable Computing Platform</project-description>
   <project-url>http://hadoop.apache.org/core/</project-url>
-  <project-logo>images/core-logo.gif</project-logo>
+  <project-logo>images/common-logo.jpg</project-logo>
 
   <!-- group logo -->
   <group-name>Hadoop</group-name>
@@ -146,11 +146,11 @@
     <!--Headers -->
 	#content h1 {
 	  margin-bottom: .5em;
-	  font-size: 200%; color: black;
+	  font-size: 185%; color: black;
 	  font-family: arial;
 	}  
-    h2, .h3 { font-size: 195%; color: black; font-family: arial; }
-	h3, .h4 { font-size: 140%; color: black; font-family: arial; margin-bottom: 0.5em; }
+    h2, .h3 { font-size: 175%; color: black; font-family: arial; }
+	h3, .h4 { font-size: 135%; color: black; font-family: arial; margin-bottom: 0.5em; }
 	h4, .h5 { font-size: 125%; color: black;  font-style: italic; font-weight: bold; font-family: arial; }
 	h5, h6 { font-size: 110%; color: #363636; font-weight: bold; } 
    


