giraph-user mailing list archives

From MengXiaodong <mengxiaodong1...@gmail.com>
Subject Re: How to format Giraph input dataset
Date Thu, 12 Mar 2015 15:04:52 GMT
Hi Martin,

Thank you for your kind reply. I followed your suggestion and entered the command below:

hadoop jar giraph-examples/target/giraph-examples-1.2.0-SNAPSHOT-for-hadoop-0.20.203.0-jar-with-dependencies.jar
org.apache.giraph.GiraphRunner org.apache.giraph.examples.SimpleShortestPathsComputation -eif
org.apache.giraph.io.formats.IntNullTextEdgeInputFormat -eip /WikiTalk.txt -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat
-op /outputTran -w 1

However, I got an error when I ran this command:
Exception in thread "main" java.lang.IllegalArgumentException: checkClassTypes: vertex index
types not assignable, computation - class org.apache.hadoop.io.LongWritable, EdgeInputFormat
- class org.apache.hadoop.io.NullWritable
	at org.apache.giraph.job.GiraphConfigurationValidator.checkAssignable(GiraphConfigurationValidator.java:384)
	at org.apache.giraph.job.GiraphConfigurationValidator.verifyEdgeInputFormatGenericTypes(GiraphConfigurationValidator.java:242)
	at org.apache.giraph.job.GiraphConfigurationValidator.validateConfiguration(GiraphConfigurationValidator.java:142)
	at org.apache.giraph.utils.ConfigurationUtils.parseArgs(ConfigurationUtils.java:222)
	at org.apache.giraph.GiraphRunner.run(GiraphRunner.java:74)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
	at org.apache.giraph.GiraphRunner.main(GiraphRunner.java:124)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:483)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:156)



I assume the error happens because the input format uses IntWritable while the example
uses LongWritable as the vertex ID. If so, may I ask how to convert IntWritable to LongWritable?
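
One possible workaround, sketched here under assumptions and not tested against Giraph: convert the edge list offline into the JSON adjacency format from the quick start, [vertex_id, vertex_value, [[target_id, edge_value], ...]], which JsonLongDoubleFloatDoubleVertexInputFormat reads with long vertex IDs. The function name edge_list_to_json_lines, the default vertex value 0, and the default edge value 1 are illustrative choices, not anything Giraph prescribes; WikiTalk itself carries no edge weights.

```python
import json
from collections import defaultdict


def edge_list_to_json_lines(lines, vertex_value=0, edge_value=1):
    """Turn 'source<whitespace>target' lines into JSON adjacency lines.

    Each yielded string has the quick-start shape:
    [vertex_id, vertex_value, [[target_id, edge_value], ...]]
    vertex_value and edge_value are assumed defaults (hypothetical).
    """
    adjacency = defaultdict(list)
    for line in lines:
        line = line.strip()
        # Skip blank lines and '#' headers such as '# FromNodeId ToNodeId'.
        if not line or line.startswith("#"):
            continue
        source, target = (int(token) for token in line.split()[:2])
        adjacency[source].append(target)
        # Give vertices without outgoing edges a line of their own.
        adjacency.setdefault(target, [])
    for vertex in sorted(adjacency):
        edges = [[target, edge_value] for target in adjacency[vertex]]
        yield json.dumps([vertex, vertex_value, edges], separators=(",", ":"))


# Small demo using the first few WikiTalk edges quoted in this thread.
sample = ["# FromNodeId\tToNodeId", "0\t1", "2\t1", "2\t21"]
for json_line in edge_list_to_json_lines(sample):
    print(json_line)
```

The converted file could then be passed with -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat and -vip, as in the quick start, instead of -eif and -eip. The sketch keeps the whole adjacency map in memory, so a very large graph would need a streaming or sorted approach.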

Kind regards,
Ralph

> On Mar 11, 2015, at 4:02 PM, Martin Junghanns <martin.junghanns@gmx.net> wrote:
> 
> Hi Ralph,
> 
> you can set a vertex or edge input format when running a Giraph job.
> In the example, you used the vertex input format (vif)
> 
> "-vif
> org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat"
> 
> Your wikitalk input format is an edge list and Giraph offers, e.g.,
> 
> "org.apache.giraph.io.formats.IntNullTextEdgeInputFormat"
> 
> which reads a graph where "Each line consists of: source_vertex,
> target_vertex" (separated by a \t)
> 
> You can set the edge input format via the -eif parameter.
> 
> Cheers,
> Martin
> 
> The package "org.apache.giraph.io.formats" in giraph-core contains a lot
> more formats.
> 
> On 11.03.2015 06:37, MengXiaodong wrote:
>> Hi all,
>> 
>> I'm new to Giraph. I successfully ran my first example by
>> following the instructions in the Giraph Quick Start. However, I ran
>> into a question when writing my own Giraph code.
>> 
>> In the Quick Start, the input graph has the following format:
>> 
>> [0,0,[[1,1],[3,3]]]
>> [1,0,[[0,1],[2,2],[3,1]]]
>> [2,0,[[1,2],[4,4]]]
>> [3,0,[[0,3],[1,1],[4,4]]]
>> [4,0,[[3,4],[2,4]]]
>> 
>> But graph datasets downloaded from public websites (for example,
>> Facebook or Twitter social networks) come in various formats. How
>> can I transform such a graph into the standard Giraph format shown
>> above?
>> 
>> For example, the WikiTalk graph below is a directed graph.
>> A directed edge A->B means user A edited the talk page of B.
>> 
>> # FromNodeId	ToNodeId
>> 0	1
>> 2	1
>> 2	21
>> 2	46
>> 2	63
>> 2	88
>> 2	93
>> 2	94
>> 2	101
>> 2	102
>> 2	103
>> 2	116
>> 2	119
>> 2	125
>> 
>> Regards, Ralph
>> 

