giraph-user mailing list archives

From Paolo Castagna <castagna.li...@googlemail.com>
Subject Re: SimplePageRankVertex implementation, dangling nodes and sending messages to all nodes...
Date Mon, 28 May 2012 20:54:07 GMT
Sebastian Schelter wrote:
> However, the problem with this input is that the dangling vertices that
> don't have a line of their own (such as 11) cannot contribute their
> accumulated rank, as no vertex for them will be instantiated. So
> counting them doesn't help either.

No, the 'implicit' dangling nodes (such as 6, 7, 9 and 11 below) are
instantiated when you send a message to them. If you run the example,
you'll see that after the first superstep there are 11 vertices which
are sending and receiving messages (as it should be with correct input).
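To make the mechanism concrete, here is a toy model in plain Java. It is not Giraph code and does not use the Giraph API: the class name, the superstep method and the three-vertex graph are all made up for illustration. It only mimics the behaviour described above, where a vertex that has no line in the input comes into existence once a message is addressed to it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: hypothetical names, no Giraph classes involved.
public class ImplicitVertexDemo {

    /**
     * Simulates one superstep in which every loaded vertex sends a message
     * to each of its out-neighbours. A target that has no entry yet is
     * created with an empty edge list, i.e. as a dangling vertex.
     */
    public static Map<Long, List<Long>> superstep(Map<Long, List<Long>> graph) {
        Map<Long, List<Long>> next = new TreeMap<>(graph);
        for (List<Long> targets : graph.values()) {
            for (Long target : targets) {
                // Delivering a message to an unknown id instantiates it.
                next.putIfAbsent(target, new ArrayList<Long>());
            }
        }
        return next;
    }

    public static void main(String[] args) {
        // Hypothetical input: vertex 3 is only ever a target, it has no line.
        Map<Long, List<Long>> graph = new TreeMap<>();
        graph.put(1L, Arrays.asList(2L, 3L));
        graph.put(2L, Arrays.asList(3L));

        Map<Long, List<Long>> after = superstep(graph);
        System.out.println(graph.size() + " vertices loaded, "
                + after.size() + " after the first superstep");
    }
}
```

In this sketch the implicit vertex ends up with no out-edges, which is exactly what makes it a dangling node for PageRank purposes.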

> I think that we should rely on users supplying valid input (a line for
> each vertex) and not try to correct for that in the vertex class.

Well, I don't disagree in principle.

But in practice this won't stop users from making mistakes and feeding your
software bad input data. :-) One superstep for cleaning/validating
the input data isn't that bad after all.

> Creating a line for each vertex from such a file is an easy task that is
> doable with a single MapReduce pass over the data beforehand.

Sure. (Why is this better than a superstep with Giraph?) ;-)
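For comparison, the preprocessing pass suggested in the quote could look roughly like the sketch below. It is plain single-machine Java rather than an actual MapReduce job, and it assumes a whitespace-separated "id neighbour neighbour ..." line format; the format, the class name and the sample lines are assumptions, not taken from the input discussed in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, assuming lines of the form "id neighbour neighbour ...".
public class CompleteAdjacency {

    /**
     * Returns one line per vertex: the original lines, followed by a line
     * containing only the id for every vertex that appears solely as a target.
     */
    public static List<String> complete(List<String> lines) {
        Map<String, String> lineById = new LinkedHashMap<>();
        Set<String> targets = new LinkedHashSet<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] tokens = trimmed.split("\\s+");
            lineById.put(tokens[0], trimmed);
            targets.addAll(Arrays.asList(tokens).subList(1, tokens.length));
        }
        for (String target : targets) {
            // A target without its own line becomes an explicit dangling vertex.
            lineById.putIfAbsent(target, target);
        }
        return new ArrayList<>(lineById.values());
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("1 2 3", "2 3");
        for (String line : complete(input)) {
            System.out.println(line);
        }
    }
}
```

Either way the work is a single scan over the edges, so the choice between a preprocessing pass and a first superstep is mostly about where you want the cleanup to live.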

Paolo