hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Lucene-hadoop Wiki] Update of "Sort" by OwenOMalley
Date Wed, 05 Jul 2006 23:36:25 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by OwenOMalley:
http://wiki.apache.org/lucene-hadoop/Sort

------------------------------------------------------------------------------
  
  == Running Sort Benchmark ==
  
- To use the sort example as a benchmark, generate 10GB/node of random data using RandomWriter.
Then sort the data using ["Sort"]. This provides a sort benchmark that scales depending on
the size of the cluster. By default, the ["Sort"] programs uses 1.0 * capacity for the number
of reduces and depending on your cluster you may see better results at 1.75 * capacity.
+ To use the sort example as a benchmark, generate 10GB/node of random data using RandomWriter.
Then sort the data using the sort example. This provides a sort benchmark that scales with
the size of the cluster. By default, the sort example uses 1.0 * capacity for the number
of reduces; depending on your cluster, you may see better results at 1.75 * capacity.
  
+ The commands are:
+  % bin/hadoop jar hadoop-*-examples.jar randomwriter rand
+  
+  % bin/hadoop jar hadoop-*-examples.jar sort rand rand-sort
+ The first command will generate the unsorted data in the ''rand'' directory. The second
command will read that data, sort it, and write the result into the ''rand-sort'' directory.
+ 
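
As a rough illustration of the benchmark described in this change, the following shell sketch works out the data volume and reduce counts and then runs the two commands in order. It is not part of the wiki page: the function names, the `HADOOP` and `EXAMPLES_JAR` variables, and the reading of "capacity" as the cluster's total reduce slots are all assumptions made for the example.

```shell
#!/bin/sh
# Sketch only: every name below is illustrative and does not come from the wiki page.

# Total random data to generate, in GB, at the page's 10GB/node guideline.
# $1 = number of nodes
benchmark_data_gb() {
  echo $(( $1 * 10 ))
}

# Reduce count for a given capacity and a multiplier given in hundredths
# (100 means 1.0 * capacity, 175 means 1.75 * capacity).
# Integer division rounds down.
# ASSUMPTION: "capacity" is taken to mean the cluster's total reduce slots.
# $1 = capacity, $2 = multiplier in hundredths
reduce_count() {
  echo $(( $1 * $2 / 100 ))
}

# Runs the two commands from the page in order and prints the elapsed
# wall-clock seconds for each step.
# ASSUMPTION: HADOOP and EXAMPLES_JAR are overridable defaults chosen for
# this sketch; the defaults mirror the commands shown on the page.
run_sort_benchmark() {
  hadoop_cmd="${HADOOP:-bin/hadoop}"
  jar="${EXAMPLES_JAR:-hadoop-*-examples.jar}"

  # $jar is left unquoted on purpose so the shell can expand the glob.
  start=$(date +%s)
  $hadoop_cmd jar $jar randomwriter rand || return 1
  echo "randomwriter: $(( $(date +%s) - start ))s"

  start=$(date +%s)
  $hadoop_cmd jar $jar sort rand rand-sort || return 1
  echo "sort: $(( $(date +%s) - start ))s"
}
```

For a 20-node cluster, `benchmark_data_gb 20` prints 200. If that cluster is assumed to have two reduce slots per node (capacity 40), `reduce_count 40 100` prints 40 and `reduce_count 40 175` prints 70. Calling `run_sort_benchmark` from the Hadoop install directory runs both steps and stops if the first one fails.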
