hadoop-common-commits mailing list archives

From acmur...@apache.org
Subject svn commit: r588778 - in /lucene/hadoop/trunk: CHANGES.txt src/java/overview.html
Date Fri, 26 Oct 2007 21:01:26 GMT
Author: acmurthy
Date: Fri Oct 26 14:01:26 2007
New Revision: 588778

URL: http://svn.apache.org/viewvc?rev=588778&view=rev
Log:
HADOOP-2105.  Improve overview.html to clarify supported platforms and software prerequisites
for Hadoop, how to install them on various platforms, and give a better general description of Hadoop
and its utility. Contributed by Jim Kellerman.

Modified:
    lucene/hadoop/trunk/CHANGES.txt
    lucene/hadoop/trunk/src/java/overview.html

Modified: lucene/hadoop/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/CHANGES.txt?rev=588778&r1=588777&r2=588778&view=diff
==============================================================================
--- lucene/hadoop/trunk/CHANGES.txt (original)
+++ lucene/hadoop/trunk/CHANGES.txt Fri Oct 26 14:01:26 2007
@@ -35,6 +35,11 @@
 
     HADOOP-1210.  Log counters in job history. (Owen O'Malley via ddas)
 
+    HADOOP-2105.  Improve overview.html to clarify supported platforms, 
+    software prerequisites for Hadoop, how to install them on various 
+    platforms, and a better general description of Hadoop and its utility. 
+    (Jim Kellerman via acmurthy) 
+
   OPTIMIZATIONS
 
     HADOOP-1898.  Release the lock protecting the last time of the last stack

Modified: lucene/hadoop/trunk/src/java/overview.html
URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/java/overview.html?rev=588778&r1=588777&r2=588778&view=diff
==============================================================================
--- lucene/hadoop/trunk/src/java/overview.html (original)
+++ lucene/hadoop/trunk/src/java/overview.html Fri Oct 26 14:01:26 2007
@@ -1,3 +1,4 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
 <html>
 <head>
    <title>Hadoop</title>
@@ -6,47 +7,110 @@
 
 Hadoop is a distributed computing platform.
 
-<p>Hadoop primarily consists of a distributed filesystem (DFS, in <a
-href="org/apache/hadoop/dfs/package-summary.html">org.apache.hadoop.dfs</a>)
-and an implementation of a MapReduce distributed data processor (in <a
-href="org/apache/hadoop/mapred/package-summary.html">org.apache.hadoop.mapred
-</a>).</p>
+<p>Hadoop primarily consists of the <a 
+href="org/apache/hadoop/dfs/package-summary.html">Hadoop Distributed FileSystem 
+(HDFS)</a> and an 
+implementation of the <a href="org/apache/hadoop/mapred/package-summary.html">
+Map-Reduce</a> programming paradigm.</p>
+
+
+<p>Hadoop is a software framework that lets one easily write and run applications 
+that process vast amounts of data. Here's what makes Hadoop especially useful:</p>
+<ul>
+  <li>
+    <b>Scalable</b>: Hadoop can reliably store and process petabytes.
+  </li>
+  <li>
+    <b>Economical</b>: It distributes the data and processing across clusters 
+    of commonly available computers. These clusters can number into the thousands 
+    of nodes.
+  </li>
+  <li>
+    <b>Efficient</b>: By distributing the data, Hadoop can process it in parallel 
+    on the nodes where the data is located. This makes it extremely rapid.
+  </li>
+  <li>
+    <b>Reliable</b>: Hadoop automatically maintains multiple copies of data and 
+    automatically redeploys computing tasks based on failures.
+  </li>
+</ul>  
 
 <h2>Requirements</h2>
 
-<ol>
-  
-<li>Java 1.5.x, preferably from <a
- href="http://java.sun.com/j2se/downloads.html">Sun</a> Set
- <tt>JAVA_HOME</tt> to the root of your Java installation.</li>
-  
-<li>ssh must be installed and sshd must be running to use Hadoop's
-scripts to manage remote Hadoop daemons.  On Ubuntu, this may done
-with <br><tt>sudo apt-get install ssh</tt></li>
-  
-<li>rsync must be installed to use Hadoop's scripts to manage remote
-Hadoop installations.  On Ubuntu, this may done with <br><tt>sudo
-apt-get install rsync</tt>.</li>
-  
-<li>On Win32, <a href="http://www.cygwin.com/">cygwin</a>, for shell
-support.  To use Subversion on Win32, select the subversion package
-when you install, in the "Devel" category.  Distributed operation has
-not been well tested on Win32, so this should primarily be considered
-a development platform at this point, not a production platform.</li>
+<h3>Platforms</h3>
+
+<ul>
+  <li>
+    Hadoop has been demonstrated on GNU/Linux clusters with 2000 nodes.
+  </li>
+  <li>
+    Win32 is supported as a <i>development</i> platform. Distributed operation 
+    has not been well tested on Win32, so this is not a <i>production</i> 
+    platform.
+  </li>  
+</ul>
   
+<h3>Requisite Software</h3>
+
+<ol>
+  <li>
+    Java 1.5.x, preferably from 
+    <a href="http://java.sun.com/j2se/downloads.html">Sun</a>. 
+    Set <tt>JAVA_HOME</tt> to the root of your Java installation.
+  </li>
+  <li>
+    ssh must be installed and sshd must be running to use Hadoop's
+    scripts to manage remote Hadoop daemons.
+  </li>
+  <li>
+    rsync may be installed to use Hadoop's scripts to manage remote
+    Hadoop installations.
+  </li>
 </ol>
 
+<h4>Additional requirements for Windows</h4>
+
+<ol>
+  <li>
+    <a href="http://www.cygwin.com/">Cygwin</a> - Required for shell support in 
+    addition to the required software above.
+  </li>
+  <li>
+    Subversion - Optional, for checking-out code from the source repository.
+  </li>
+</ol>
+  
+<h3>Installing Required Software</h3>
+
+<p>If your platform does not have the required software listed above, you
+will have to install it.</p>
+
+<p>For example on Ubuntu Linux:</p>
+<p><blockquote><pre>
+$ sudo apt-get install ssh<br>
+$ sudo apt-get install rsync<br>
+</pre></blockquote></p>
+
+<p>On Windows, if you did not install the required software when you
+installed cygwin, start the cygwin installer and select the packages:</p>
+<ul>
+  <li>openssh - the "Net" category</li>
+  <li>rsync - the "Net" category</li>
+  <li>subversion (optional) - the "Devel" category</li>
+</ul>
+
 <h2>Getting Started</h2>
 
 <p>First, you need to get a copy of the Hadoop code.</p>
 
 <p>You can download a nightly build from <a
-href="http://cvs.apache.org/dist/lucene/hadoop/nightly/">http://cvs.apache.org/dist/lucene/hadoop/nightly/</a>.
-Unpack the release and connect to its top-level directory.</p>
+href="http://cvs.apache.org/dist/lucene/hadoop/nightly/">
+http://cvs.apache.org/dist/lucene/hadoop/nightly/</a>. Unpack the release and 
+connect to its top-level directory.</p>
 
 <p>Or, check out the code from <a
 href="http://lucene.apache.org/hadoop/version_control.html">subversion</a>
-and build it with <a href="http://ant.apache.org/">Ant</a>.</p>
+and build it with <a href="http://ant.apache.org/">ant</a>.</p>
 
 <p>Edit the file <tt>conf/hadoop-env.sh</tt> to define at least
 <tt>JAVA_HOME</tt>.</p>
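The "Getting Started" steps in the patched overview can be sketched as a short shell session. This is an illustration only, not part of the commit: the Java install path and the trunk checkout URL are assumptions for a typical 2007-era Linux box, and the svn/ant commands are shown as comments since they need network access and the build tools on the PATH.

```shell
# 1. Get the code (requires subversion and ant installed):
#    svn co http://svn.apache.org/repos/asf/lucene/hadoop/trunk hadoop-trunk
#    cd hadoop-trunk && ant

# 2. Define at least JAVA_HOME in conf/hadoop-env.sh.
#    Here we append the setting to a scratch copy of the file so the
#    edit is easy to inspect; the JVM path is a hypothetical example.
mkdir -p conf
printf '# Hadoop environment settings\n' > conf/hadoop-env.sh
echo 'export JAVA_HOME=/usr/lib/jvm/java-1.5.0' >> conf/hadoop-env.sh

# Confirm the variable is now defined in the file:
grep JAVA_HOME conf/hadoop-env.sh
```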


