giraph-user mailing list archives

From Martin Junghanns <martin.jungha...@gmx.net>
Subject Re: How to format Giraph input dataset
Date Wed, 11 Mar 2015 07:02:36 GMT
Hi Ralph,

you can set a vertex input format or an edge input format when running a
Giraph job. In the Quick Start example, you used a vertex input format
(-vif):

"-vif
org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat"

Your WikiTalk input is an edge list, and for that Giraph offers, e.g.,

"org.apache.giraph.io.formats.IntNullTextEdgeInputFormat"

which reads a graph where "Each line consists of: source_vertex,
target_vertex" (separated by a tab, \t).

You can set the edge input format via the -eif parameter.
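
One caveat with SNAP-style files such as WikiTalk: they contain "#"
header lines like the one in your sample, and I would not expect a plain
text edge format to skip those (that is an assumption, I have not checked
it against the parser). A minimal Python sketch for cleaning the file
before uploading it; the function name clean_edge_list is made up for
illustration:

```python
import io


def clean_edge_list(lines):
    """Yield 'source<TAB>target' lines, skipping comments and blank lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # drop header/comment lines and empty lines
        source, target = line.split()[:2]
        # int() rejects anything that is not an integer vertex id
        yield f"{int(source)}\t{int(target)}"


# First lines of the WikiTalk sample from the original question
sample = io.StringIO("# FromNodeId\tToNodeId\n0\t1\n2\t1\n2\t21\n")
cleaned = list(clean_edge_list(sample))
print("\n".join(cleaned))
```

For a real dataset you would pass an open file object instead of the
in-memory sample and write the yielded lines to a new file.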

Cheers,
Martin

P.S. The package "org.apache.giraph.io.formats" in giraph-core contains
many more formats.
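
If you would rather keep the JSON vertex format from the Quick Start,
the edge list can also be converted offline. The sketch below is only an
illustration: the function name edges_to_json_lines is hypothetical, and
the vertex value 0 and edge weight 1 are assumed placeholders, since
WikiTalk itself carries no values or weights.

```python
import json


def edges_to_json_lines(edge_lines, vertex_value=0, edge_weight=1):
    """Turn 'source target' lines into [id, value, [[target, weight], ...]] lines."""
    adjacency = {}
    for line in edge_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip comment and blank lines
        source, target = (int(x) for x in line.split()[:2])
        adjacency.setdefault(source, []).append([target, edge_weight])
        # Make sure vertices without outgoing edges also get a line
        adjacency.setdefault(target, [])
    return [
        json.dumps([vertex, vertex_value, adjacency[vertex]], separators=(",", ":"))
        for vertex in sorted(adjacency)
    ]


# First lines of the WikiTalk sample from the original question
sample = ["# FromNodeId\tToNodeId", "0\t1", "2\t1", "2\t21"]
json_lines = edges_to_json_lines(sample)
print("\n".join(json_lines))
```

This keeps the whole adjacency structure in memory, so it is only
suitable for graphs that fit on a single machine; for larger inputs the
edge input format is the simpler route.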

On 11.03.2015 06:37, MengXiaodong wrote:
> Hi all,
> 
> I'm new to Giraph. I successfully ran my first example by following
> the instructions in the Giraph Quick Start. However, I ran into a
> question when writing my own Giraph code.
> 
> In the Quick Start, the input graph has the following format:
> 
> [0,0,[[1,1],[3,3]]]
> [1,0,[[0,1],[2,2],[3,1]]]
> [2,0,[[1,2],[4,4]]]
> [3,0,[[0,3],[1,1],[4,4]]]
> [4,0,[[3,4],[2,4]]]
> 
> But graph datasets downloaded from public websites (such as the
> Facebook or Twitter social networks) come in various formats. How can
> I transform such a graph into the standard Giraph format shown above?
> 
> For example, take the WikiTalk graph below, which is a directed graph.
> A directed edge A->B means that user A edited the talk page of user B.
> 
> # FromNodeId	ToNodeId
> 0	1
> 2	1
> 2	21
> 2	46
> 2	63
> 2	88
> 2	93
> 2	94
> 2	101
> 2	102
> 2	103
> 2	116
> 2	119
> 2	125
> 
> Regards, Ralph
> 
