hadoop-common-commits mailing list archives

From acmur...@apache.org
Subject svn commit: r589555 - in /lucene/hadoop/branches/branch-0.15: CHANGES.txt src/java/overview.html
Date Mon, 29 Oct 2007 09:18:07 GMT
Author: acmurthy
Date: Mon Oct 29 02:18:06 2007
New Revision: 589555

URL: http://svn.apache.org/viewvc?rev=589555&view=rev
Merge -r 588777:588778 and 589551:589552 from trunk to branch-0.15 to fix HADOOP-2105.


Modified: lucene/hadoop/branches/branch-0.15/CHANGES.txt
URL: http://svn.apache.org/viewvc/lucene/hadoop/branches/branch-0.15/CHANGES.txt?rev=589555&r1=589554&r2=589555&view=diff
--- lucene/hadoop/branches/branch-0.15/CHANGES.txt (original)
+++ lucene/hadoop/branches/branch-0.15/CHANGES.txt Mon Oct 29 02:18:06 2007
@@ -457,6 +457,11 @@
     HADOOP-2046.  Improve mapred javadoc.  (Arun C. Murthy via cutting)
+    HADOOP-2105.  Improve overview.html to clarify supported platforms, 
+    software prerequisites for Hadoop, how to install them on various 
+    platforms and a better general description of Hadoop and its utility. 
+    (Jim Kellerman via acmurthy) 
 Release 0.14.3 - 2007-10-19

Modified: lucene/hadoop/branches/branch-0.15/src/java/overview.html
URL: http://svn.apache.org/viewvc/lucene/hadoop/branches/branch-0.15/src/java/overview.html?rev=589555&r1=589554&r2=589555&view=diff
--- lucene/hadoop/branches/branch-0.15/src/java/overview.html (original)
+++ lucene/hadoop/branches/branch-0.15/src/java/overview.html Mon Oct 29 02:18:06 2007
@@ -1,3 +1,4 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
@@ -6,47 +7,110 @@
 Hadoop is a distributed computing platform.
-<p>Hadoop primarily consists of a distributed filesystem (DFS, in <a
-and an implementation of a MapReduce distributed data processor (in <a
+<p>Hadoop primarily consists of the <a 
+href="org/apache/hadoop/dfs/package-summary.html">Hadoop Distributed FileSystem 
+(HDFS)</a> and an 
+implementation of the <a href="org/apache/hadoop/mapred/package-summary.html">
+Map-Reduce</a> programming paradigm.</p>
+<p>Hadoop is a software framework that lets one easily write and run applications 
+that process vast amounts of data. Here's what makes Hadoop especially useful:</p>
+  <li>
+    <b>Scalable</b>: Hadoop can reliably store and process petabytes.
+  </li>
+  <li>
+    <b>Economical</b>: It distributes the data and processing across clusters
+    of commonly available computers. These clusters can number into the thousands 
+    of nodes.
+  </li>
+  <li>
+    <b>Efficient</b>: By distributing the data, Hadoop can process it in parallel
+    on the nodes where the data is located. This makes it extremely rapid.
+  </li>
+  <li>
+    <b>Reliable</b>: Hadoop automatically maintains multiple copies of data and
+    automatically redeploys computing tasks based on failures.
+  </li>
-<li>Java 1.5.x, preferably from <a
- href="http://java.sun.com/j2se/downloads.html">Sun</a> Set
- <tt>JAVA_HOME</tt> to the root of your Java installation.</li>
-<li>ssh must be installed and sshd must be running to use Hadoop's
-scripts to manage remote Hadoop daemons.  On Ubuntu, this may done
-with <br><tt>sudo apt-get install ssh</tt></li>
-<li>rsync must be installed to use Hadoop's scripts to manage remote
-Hadoop installations.  On Ubuntu, this may done with <br><tt>sudo
-apt-get install rsync</tt>.</li>
-<li>On Win32, <a href="http://www.cygwin.com/">cygwin</a>, for shell
-support.  To use Subversion on Win32, select the subversion package
-when you install, in the "Devel" category.  Distributed operation has
-not been well tested on Win32, so this should primarily be considered
-a development platform at this point, not a production platform.</li>
+  <li>
+    Hadoop has been demonstrated on GNU/Linux clusters with 2000 nodes.
+  </li>
+  <li>
+    Win32 is supported as a <i>development</i> platform. Distributed operation
+    has not been well tested on Win32, so this is not a <i>production</i> 
+    platform.
+  </li>  
+<h3>Requisite Software</h3>
+  <li>
+    Java 1.5.x, preferably from 
+    <a href="http://java.sun.com/j2se/downloads.html">Sun</a>. 
+    Set <tt>JAVA_HOME</tt> to the root of your Java installation.
+  </li>
+  <li>
+    ssh must be installed and sshd must be running to use Hadoop's
+    scripts to manage remote Hadoop daemons.
+  </li>
+  <li>
+    rsync may be installed to use Hadoop's scripts to manage remote
+    Hadoop installations.
+  </li>
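A quick sanity check of the prerequisites listed above can be sketched in shell. This is a hypothetical helper, not part of the commit, and the candidate path is an assumption that varies by platform:

```shell
#!/bin/sh
# Hypothetical helper (not from the commit): checks whether a
# directory looks like a valid JAVA_HOME, i.e. contains bin/java.
is_java_home() {
  [ -n "$1" ] && [ -x "$1/bin/java" ]
}

# Example: probe a candidate directory before exporting JAVA_HOME.
# The path below is an assumption; use your own installation root.
candidate=/usr/lib/jvm/java-1.5.0
if is_java_home "$candidate"; then
  export JAVA_HOME="$candidate"
  echo "JAVA_HOME set to $JAVA_HOME"
else
  echo "$candidate does not look like a Java installation"
fi
```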
+<h4>Additional requirements for Windows</h4>
+  <li>
+    <a href="http://www.cygwin.com/">Cygwin</a> - Required for shell support
+    in addition to the required software above.
+  </li>
+  <li>
+    Subversion - Optional, for checking out code from the source repository.
+  </li>
+<h3>Installing Required Software</h3>
+<p>If your platform does not have the required software listed above, you
+will have to install it.</p>
+<p>For example on Ubuntu Linux:</p>
+$ sudo apt-get install ssh<br>
+$ sudo apt-get install rsync<br>
+<p>On Windows, if you did not install the required software when you
+installed cygwin, start the cygwin installer and select the packages:</p>
+  <li>openssh - the "Net" category</li>
+  <li>rsync - the "Net" category</li>
+  <li>subversion (optional) - the "Devel" category</li>
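After running the install commands above, one can confirm that the required commands are on the PATH. A small sketch; the have_cmd helper is hypothetical and not part of the commit:

```shell
#!/bin/sh
# Hypothetical helper (not from the commit): report whether a
# command from the prerequisite list is available on the PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

for cmd in ssh rsync; do
  if have_cmd "$cmd"; then
    echo "$cmd: installed"
  else
    echo "$cmd: missing - see the install commands above"
  fi
done
```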
 <h2>Getting Started</h2>
 <p>First, you need to get a copy of the Hadoop code.</p>
 <p>You can download a nightly build from <a
-Unpack the release and connect to its top-level directory.</p>
+http://cvs.apache.org/dist/lucene/hadoop/nightly/</a>. Unpack the release and 
+connect to its top-level directory.</p>
 <p>Or, check out the code from <a
-and build it with <a href="http://ant.apache.org/">Ant</a>.</p>
+and build it with <a href="http://ant.apache.org/">ant</a>.</p>
 <p>Edit the file <tt>conf/hadoop-env.sh</tt> to define at least
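The <tt>conf/hadoop-env.sh</tt> edit mentioned above typically defines at least <tt>JAVA_HOME</tt>. A minimal sketch of that config fragment; the path is an assumption and varies by platform:

```shell
# conf/hadoop-env.sh (sketch, not from the commit)
# Point JAVA_HOME at the root of your Java installation.
export JAVA_HOME=/usr/lib/jvm/java-1.5.0
```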
