hama-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hama Wiki] Update of "PageRank" by thomasjungblut
Date Wed, 12 Sep 2012 12:47:56 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hama Wiki" for change notification.

The "PageRank" page has been changed by thomasjungblut:

   * Uses the PageRank algorithm described in the Google Pregel paper
   * Introduces partitioning and collective communication
-  * Lets the user submit his/her own TextFile to calculate the sites' Pagerank!
  == Usage ==
- bin/hama jar ../hama-0.4.0-examples.jar pagerank <input path> <output path> [damping factor] [epsilon error] [tasks]
+ bin/hama jar ../hama-0.x.0-examples.jar pagerank <input path> <output path> [damping factor] [epsilon error] [tasks]
  The default parameters for pagerank are:
@@ -39, +38 @@

  Make sure that every site's outlink can somewhere be found in the file as a key-site. Otherwise it will result in weird NullPointerExceptions.
- Now you need to transform the text file using:
- {{{
- bin/hama jar ../hama-0.4.0-examples.jar pagerank-text2seq /tmp/input.txt /tmp/out/
- }}}
  Then you can run pagerank on it with:
- bin/hama jar ../hama-0.4.0-examples.jar pagerank /tmp/out /tmp/pagerank-output
+ bin/hama jar ../hama-0.x.0-examples.jar pagerank /tmp/input/input.txt /tmp/pagerank-output
  Note that based on what you have configured, the paths may be in HDFS or on local disk.
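
The key-site requirement mentioned in this hunk can be checked before submitting a job. The following is only a minimal sketch, not part of Hama: it assumes the input is a text adjacency list with one site per line followed by its outlinks, separated by tabs (the separator is an assumption, since this message does not state it). The function names `parse_adjacency` and `missing_key_sites` are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def parse_adjacency(lines: Iterable[str], sep: str = "\t") -> Dict[str, List[str]]:
    """Parse lines of the assumed form: site<sep>outlink1<sep>outlink2 ..."""
    graph: Dict[str, List[str]] = {}
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        site, *outlinks = line.split(sep)
        # Merge repeated key-sites and ignore empty fields.
        graph.setdefault(site, []).extend(o for o in outlinks if o)
    return graph


def missing_key_sites(graph: Dict[str, List[str]]) -> Set[str]:
    """Return outlinks that never appear as a key-site (first column)."""
    targets = {o for outlinks in graph.values() for o in outlinks}
    return targets - set(graph)


if __name__ == "__main__":
    sample = ["a\tb\tc\n", "b\ta\n"]
    # "c" is linked to but has no line of its own.
    print(sorted(missing_key_sites(parse_adjacency(sample))))
```

An empty result means every outlink also has its own line in the file, which is the condition the wiki text asks for.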
@@ -59, +53 @@

  All pages' rank should sum up to 1.0, otherwise the algorithm is broken.
- == Sample Adjacencylist File ==
- You can create a large pagerank input file by using the PagerankTeragen file from here:
- It is based on MapReduce and requires a running Hadoop cluster. You can create a file using
- {{{
- hadoop/bin hadoop -jar <jar containing the pagerank teragen> <number of vertices> <number of reducers / output files> <number of edges per vertex> <output path>
- }}}
- Have fun! If you are facing problems, feel free to ask questions on the official mailing
  == Implementation ==
  For detailed questions in terms of implementation have a look at my blog.
- It describes the algorithm and focuses on the main ideas showing implementation things.
+ It describes the algorithm and focuses on the main ideas showing implementation things.

+ It contains ancient code from before Hama 0.5 where we introduced the graph API.
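
The sanity check quoted in the diff above (all pages' ranks should sum to 1.0) can be sketched as follows. This is an illustrative helper only: it assumes the ranks have already been read from the job output into a list of floats, and both the function name `ranks_sum_to_one` and the tolerance value are assumptions rather than anything Hama defines.

```python
import math
from typing import Iterable


def ranks_sum_to_one(ranks: Iterable[float], tol: float = 1e-6) -> bool:
    """Check that the ranks sum to 1.0 within an absolute tolerance."""
    # math.fsum reduces floating-point accumulation error on long lists.
    return math.isclose(math.fsum(ranks), 1.0, abs_tol=tol)


if __name__ == "__main__":
    print(ranks_sum_to_one([0.5, 0.3, 0.2]))
    print(ranks_sum_to_one([0.5, 0.3]))
```

A tolerance is needed because the ranks are floating-point values and will rarely add up to exactly 1.0.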
