giraph-dev mailing list archives

From Sebastian Schelter <...@apache.org>
Subject Reduce memory footprint of RandomWalkVertex
Date Wed, 16 Jan 2013 16:05:34 GMT
Hi,

I'm currently working on GIRAPH-480 to add a convergence check to
RandomWalkVertex (which is an abstract version of PageRank and
RandomWalkWithRestart).

RandomWalkVertex extends LongDoubleFloatDoubleEdgeListVertex, which means
that the edge values (the transition probabilities between the vertices)
are explicitly modeled. AFAIK, in most cases these probabilities are
uniform, so we could simply use 1 / getNumEdges() as the transition
probability and save a lot of space by omitting the edge values for each
vertex. RandomWalkVertex could then simply extend
LongDoubleNullDoubleVertex.
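
To illustrate the idea, here is a rough sketch in plain Java. It does not use the Giraph API; the class UniformTransitionSketch and its methods are hypothetical names made up for this example. It compares the contributions a vertex would send along its edges when a float is stored per edge with the contributions computed from the edge count alone.

```java
import java.util.Arrays;

// Hypothetical stand-alone sketch, not Giraph code.
public class UniformTransitionSketch {

    // Explicit model: one stored float per outgoing edge.
    public static double[] contributionsExplicit(double rank, float[] edgeProbabilities) {
        double[] out = new double[edgeProbabilities.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = rank * edgeProbabilities[i];
        }
        return out;
    }

    // Uniform model: nothing is stored per edge, the probability is 1 / numEdges.
    public static double[] contributionsUniform(double rank, int numEdges) {
        double[] out = new double[numEdges];
        if (numEdges == 0) {
            return out; // dangling vertex: no outgoing edges to send along
        }
        Arrays.fill(out, rank / numEdges);
        return out;
    }

    public static void main(String[] args) {
        double rank = 0.4;
        float[] stored = {0.25f, 0.25f, 0.25f, 0.25f};

        double[] explicit = contributionsExplicit(rank, stored);
        double[] uniform = contributionsUniform(rank, stored.length);

        // Both models send the same values when the stored probabilities are uniform.
        System.out.println(Arrays.toString(explicit));
        System.out.println(Arrays.toString(uniform));
    }
}
```

For uniform probabilities both methods yield the same contributions, but the second one needs only the number of edges, so the per-edge float can be dropped.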

I think this issue is pretty important, as RandomWalkVertex should be
the basis for a real-world PageRank implementation (that can deal with
dangling nodes and has a convergence check).

Best,
Sebastian

PS: It's great to see how much progress Giraph has made over the last
few months!
