giraph-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Paolo Castagna (JIRA)" <>
Subject [jira] [Commented] (GIRAPH-191) Random Walk with Restart
Date Fri, 18 May 2012 10:50:08 GMT


Paolo Castagna commented on GIRAPH-191:

It would be good if we could also use this or provide a PageRank implementation which deals
with dangling nodes/vertexes properly.

Dangling vertexes are vertexes with no edges.

SinglePageRankVertex has:

    if (getSuperstep() < MAX_SUPERSTEPS) {
      long edges = getNumOutEdges();
          new DoubleWritable(getVertexValue().get() / edges));
    } else {

This does not work when getNumOutEdges() returns 0.

Some suggest to divide the PageRank scores of dangling vertexes evenly among all other vertex
(it's yet another sort of random jump to propagate PageRank scores to all nodes). This can
be implemented in Giraph as a separate superstep using a SumAggregator.

Discussion on the giraph-user mailing list with further comments and references is here:

> Random Walk with Restart
> ------------------------
>                 Key: GIRAPH-191
>                 URL:
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-191.patch
> Implementing RWR on Giraph should be a very simple modification of the SimplePageRankVertex
> {code}
> if ( myID == sourceID )
>       DoubleWritable vertexValue = new DoubleWritable((0.15f + 0.85f * sum);
> else
>       DoubleWritable vertexValue = new DoubleWritable(0.85f * sum);
> {code}
> It would be nice to make it as configurable as possible by using parametric damping factors,
preference vectors, strongly preferential, etc...
> More or less along these lines:

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:!default.jspa
For more information on JIRA, see:


View raw message