giraph-user mailing list archives

From Sebastian Schelter <...@apache.org>
Subject Re: pagerank in giraph.
Date Wed, 26 Feb 2014 21:04:38 GMT
Hi Suijian,

Giraph has several PageRank implementations. I suggest that you use
org.apache.giraph.examples.PageRankComputation, which automatically
checks convergence for you and correctly handles dangling vertices
(vertices without any outlinks).
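
To illustrate what "checking convergence" and "handling dangling vertices" mean, here is a minimal plain-Java sketch of PageRank by power iteration. It is not the code of PageRankComputation and does not use the Giraph API; the class name PageRankSketch, the method pageRank and its parameters (damping, tolerance, maxIterations) are hypothetical names chosen for this illustration. It assumes one common treatment of dangling vertices, spreading their rank uniformly over all vertices, and it stops once the L1 change between two iterations drops below the tolerance.

```java
import java.util.Arrays;

public class PageRankSketch {

    /**
     * Power iteration over an adjacency list (outLinks[v] holds the targets of v).
     * All names here are illustrative assumptions, not Giraph API.
     */
    public static double[] pageRank(int[][] outLinks, double damping,
                                    double tolerance, int maxIterations) {
        int n = outLinks.length;
        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n);               // start from the uniform distribution

        for (int iter = 0; iter < maxIterations; iter++) {
            double[] next = new double[n];
            double danglingMass = 0.0;

            for (int v = 0; v < n; v++) {
                if (outLinks[v].length == 0) {
                    // Dangling vertex: collect its rank instead of losing it.
                    danglingMass += rank[v];
                } else {
                    double share = rank[v] / outLinks[v].length;
                    for (int target : outLinks[v]) {
                        next[target] += share;
                    }
                }
            }

            double delta = 0.0;
            for (int v = 0; v < n; v++) {
                // Teleport term plus damped in-flow; dangling mass is spread uniformly.
                next[v] = (1.0 - damping) / n
                        + damping * (next[v] + danglingMass / n);
                delta += Math.abs(next[v] - rank[v]);
            }

            rank = next;
            if (delta < tolerance) {
                break;                            // converged: L1 change is small enough
            }
        }
        return rank;
    }

    public static void main(String[] args) {
        // Vertex 0 links to 1 and 2, vertex 1 links to 2, vertex 2 is dangling.
        int[][] graph = { {1, 2}, {2}, {} };
        System.out.println(Arrays.toString(pageRank(graph, 0.85, 1e-9, 100)));
    }
}
```

Because the dangling mass is redistributed rather than dropped, the ranks keep summing to 1 in every iteration, which is what makes the convergence check meaningful.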

It relies on org.apache.giraph.examples.LongDoubleNullTextInputFormat 
which expects a very simple text file. The format is one line per vertex 
with the id of the vertex followed by the ids of adjacent vertices:

src_vertex_id dest_vertex_id_1 dest_vertex_id_2 ...
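
As a rough sketch of how one such line can be read, the following plain Java splits a line into the source id and its neighbour ids. This is not the parsing code of LongDoubleNullTextInputFormat; the class AdjacencyLineSketch and the method parseLine are hypothetical, and the sketch assumes that ids are longs separated by whitespace and that the line is not empty.

```java
import java.util.Arrays;

public class AdjacencyLineSketch {

    /**
     * Parses "src dst1 dst2 ..." into an array of ids.
     * Element 0 is the source vertex, the remaining elements are its neighbours.
     * Assumes whitespace-separated long ids (an assumption of this sketch).
     */
    public static long[] parseLine(String line) {
        String[] tokens = line.trim().split("\\s+");
        long[] ids = new long[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            ids[i] = Long.parseLong(tokens[i]);
        }
        return ids;
    }

    public static void main(String[] args) {
        long[] ids = parseLine("1 2 3");
        long source = ids[0];
        long[] neighbours = Arrays.copyOfRange(ids, 1, ids.length);
        System.out.println(source + " -> " + Arrays.toString(neighbours));
    }
}
```

Note that a line carries only ids: there is no field for a vertex value or an edge value.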

See org.apache.giraph.examples.PageRankComputationTest for an example of 
how to configure it.

It needs org.apache.giraph.examples.RandomWalkWorkerContext as the worker
context and org.apache.giraph.examples.RandomWalkVertexMasterCompute as
the master compute.

Best,
Sebastian




On 02/26/2014 09:09 PM, Suijian Zhou wrote:
> Hi,
>    I would like to load a graph in the following format (common in
> social network graphs) and compute its PageRank:
>
> Src_vertex_id_1 Dest_vertex_id_2 Dest_vertex_id_3 (v1->v2, v1->v3)
> Src_vertex_id_2 Dest_vertex_id_4 Dest_vertex_id_5 Dest_vertex_id_6 (v2->v4,
> v2->v5, v2->v6)
> .....
>
> Do I have to convert the above input format into the following so that it
> is compatible with Giraph?
>
> [Src_vertex1_id_1, 1, [[Dest_vertex_id_2,0],[Dest_vertex_id_3,0]]]
> [Src_vertex1_id_2, 1,
> [[Dest_vertex_id_4,0],[Dest_vertex_id_5,0],[Dest_vertex_id_6,0]]]
> ......
>
> I.e., do I need to set the initial vertex values to 1 and the edge values to 0? Thanks!
>
>    Best Regards,
>    Suijian
>

