mahout-user mailing list archives

From Manish Katyal <>
Subject Re: page rank algorithm?
Date Thu, 01 Jul 2010 19:16:25 GMT
I have a simple PageRank implementation for general-purpose graphs, written
using Python and Hadoop streaming.
It uses the simple power method. The Map-reduce algorithm is described in
One difference -- the transition probabilities along the edges are
non-uniform in my implementation.
For what it's worth, at the end of the ranking process, the code generates a
visualization of the network graph with the PageRank values for the vertices.
This file can be viewed using GUESS (
(Obviously, for web-scale datasets this visualization is worthless.)
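
For reference, here is a minimal sketch of the power method with non-uniform
transition probabilities, as described above. It is not the code mentioned in
this message. The function names (normalize, map_step, pagerank), the damping
factor of 0.85, the even redistribution of rank from dangling nodes, and the
in-process loop standing in for the Hadoop streaming map and reduce phases are
all assumptions made for illustration.

```python
from collections import defaultdict


def normalize(graph):
    """Convert raw edge weights into per-node transition probabilities."""
    probs = {}
    for node, edges in graph.items():
        total = sum(edges.values())
        probs[node] = (
            {dst: w / total for dst, w in edges.items()} if total > 0 else {}
        )
    return probs


def map_step(rank, out_probs):
    """Map phase: emit (destination, rank share) pairs.

    Each share is weighted by the edge's transition probability,
    so the split across out-edges is non-uniform.
    """
    for dst, p in out_probs.items():
        yield dst, rank * p


def pagerank(graph, damping=0.85, iterations=50):
    """Power iteration over a weighted graph given as {src: {dst: weight}}."""
    probs = normalize(graph)
    nodes = set(probs) | {d for edges in probs.values() for d in edges}
    n = len(nodes)
    ranks = {v: 1.0 / n for v in nodes}

    for _ in range(iterations):
        sums = defaultdict(float)  # reduce phase: sum the shares per node
        dangling = 0.0             # rank held by nodes with no out-edges
        for v in nodes:
            out = probs.get(v, {})
            if not out:
                dangling += ranks[v]
            for dst, share in map_step(ranks[v], out):
                sums[dst] += share
        # Spread dangling rank evenly so the ranks keep summing to 1.
        ranks = {
            v: (1.0 - damping) / n + damping * (sums[v] + dangling / n)
            for v in nodes
        }
    return ranks


if __name__ == "__main__":
    # Hypothetical toy graph: "a" favors "b" over "c" three to one.
    toy = {"a": {"b": 3.0, "c": 1.0}, "b": {"c": 1.0}, "c": {"a": 1.0}}
    for node, rank in sorted(pagerank(toy).items()):
        print(node, round(rank, 4))
```

In a real Hadoop streaming job, map_step would write tab-separated lines to
stdout and the summation would live in a separate reducer script, with one job
run per iteration.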

I was planning on porting my code to Mahout as a good way of learning more
about Mahout.

However, if Ken is going to contribute this code, and the code is going to
be more scalable, then I can look at implementing something else -- perhaps
TextRank, SimRank...

Let me know,

- Manish

On Thu, Jul 1, 2010 at 9:24 AM, Ken Krugler <> wrote:

> On Jul 1, 2010, at 8:16am, Andrzej Bialecki wrote:
>> On 2010-06-30 21:11, Grant Ingersoll wrote:
>>> On Jun 27, 2010, at 12:10 PM, Manish Katyal wrote:
>>>> Is there an implementation of the page-rank algorithm in Mahout?
>>> No, there isn't.  However, do you mean to implement one specifically for
>>> link analysis or a general purpose one?
>> There is one in Nutch, but it's tied to the Nutch API.
> It's likely we'll be contributing one to Mahout - either based on Jimmy
> Lin's enhancements as described during Hadoop Summit on Tuesday, or we might
> try the "do it all with SVD" approach as previously proposed by Ted, and
> mentioned by Jake.
> -- Ken
> --------------------------------------------
> Ken Krugler
> +1 530-210-6378
> e l a s t i c   w e b   m i n i n g
