hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Lucene-hadoop Wiki] Update of "FrontPage" by OwenOMalley
Date Mon, 07 Aug 2006 23:51:59 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by OwenOMalley:

- Please contribute your knowledge about Hadoop here!
+ = Hadoop =
+ [http://lucene.apache.org/hadoop/ Hadoop] is a framework for managing applications across
large clusters of computers in such a way that the application does not need to worry about
either reliability or locality. Hadoop uses a computational paradigm named [:HadoopMapReduce:
Map/Reduce], where the application is divided into many fragments of work, each of which may
be executed or reexecuted on any computer in the cluster. To support locality-transparency,
Hadoop stores persistent data in a distributed file system that is designed for large streaming
reads and fault tolerance.
+ The intent is to scale Hadoop up to handling thousands of computers. The current high water
marks that have been reported are:
+  * !DataNodes: 620
+  * !TaskTrackers: 500
+ Hadoop was originally built as infrastructure for the [http://lucene.apache.org/nutch/ Nutch]
project, which crawls the web and builds a search engine index for the crawled pages. Both
Hadoop and Nutch are part of the [http://lucene.apache.org/java/docs/index.html Lucene] [http://www.apache.org/
Apache] project.
  == General Information ==
   * [http://lucene.apache.org/hadoop/ Hadoop Website ]
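
The Map/Reduce paragraph in the change above describes an application divided into fragments of work, each of which may be executed or re-executed on any computer in the cluster. The following is a minimal, single-process Python sketch of that idea, not Hadoop's API: the names map_fragment, reduce_counts, run_job and the fail_once parameter are hypothetical, and a worker failure is only simulated by rescheduling a fragment once.

```python
from collections import defaultdict


def map_fragment(lines):
    """Map step for one fragment of input: emit a (word, 1) pair per word."""
    return [(word.lower(), 1) for line in lines for word in line.split()]


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each word."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def run_job(fragments, fail_once=frozenset()):
    """Run every fragment, re-executing any fragment whose first attempt fails.

    fail_once is a hypothetical set of fragment indices whose first attempt
    is treated as a lost worker, so the fragment is scheduled again.
    """
    results = {}
    attempts = defaultdict(int)
    pending = list(range(len(fragments)))
    while pending:
        idx = pending.pop(0)
        attempts[idx] += 1
        if idx in fail_once and attempts[idx] == 1:
            pending.append(idx)  # simulated failure: put the fragment back
            continue
        # A fragment depends only on its own input, so a retry is safe.
        results[idx] = map_fragment(fragments[idx])
    all_pairs = [pair for idx in sorted(results) for pair in results[idx]]
    return reduce_counts(all_pairs), dict(attempts)
```

Because each fragment reads only its own input and produces its own output, running it a second time yields the same pairs, which is why the final counts in this sketch do not depend on which fragments had to be retried.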
