giraph-user mailing list archives

From David Garcia
Subject RE: how to use SimplePageRankVertex
Date Mon, 20 Feb 2012 01:44:16 GMT
So, if that's the case, it's possible that the TaskTracker process doesn't have the job's jar on
its classpath. Although you have added the jar to "a" classpath, I'm not certain that the
TaskTracker will have it. There are several ways to address this. 1.) You could bring Hadoop
down, and then adjust the relevant script to export the HADOOP_CLASSPATH environment variable to
include your jar. This variable is commented out by default. If you are running in distributed
mode, this means that you will have to copy this jar to every single machine, and probably
change this script on every single machine too. Unless you are using something like Condor
(or Puppet if you're hard-core serious), this is a serious pain, and for changing MR jobs,
totally overkill. My personal preference is to use the distributed cache, and copy your jar
to a location in HDFS:
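
As a rough illustration of the copy step only, the following sketch uploads a jar to HDFS with the standard `hadoop fs -put` command. The HDFS directory `/user/hdfs/lib` is a hypothetical placeholder, and the `run` helper with its `DRY_RUN` switch is an assumption added so the script can be tried without a cluster; none of this comes from the original message.

```shell
#!/bin/sh
# Sketch only: both paths are placeholders, not values from the thread.
LOCAL_JAR=${LOCAL_JAR:-INTPageRankVertex.jar}
HDFS_LIB_DIR=${HDFS_LIB_DIR:-/user/hdfs/lib}   # hypothetical HDFS directory

# With DRY_RUN=1 (the default here) each command is printed instead of
# executed, so the script can be read and tried without a running cluster.
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Copy the jar into HDFS, then list the directory to confirm it arrived.
run hadoop fs -put "$LOCAL_JAR" "$HDFS_LIB_DIR/"
run hadoop fs -ls "$HDFS_LIB_DIR"
```

Copying the jar is only half of the approach: the job setup code would still need to register the HDFS path with the distributed cache, for example through Hadoop's `DistributedCache.addFileToClassPath(Path, Configuration)`. How that fits into a particular Giraph job is not shown here.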

hope this helps.
From: yavuz gokirmak
Sent: Sunday, February 19, 2012 2:19 AM
Subject: Re: how to use SimplePageRankVertex

I am using a pseudo-distributed cluster.

On 19 February 2012 02:00, David Garcia wrote:
Are you submitting this job to a pseudo-distributed cluster or a fully distributed cluster?


----- Reply message -----
From: "yavuz gokirmak"
Subject: how to use SimplePageRankVertex
Date: Sat, Feb 18, 2012 2:04 pm

Thank you for the advice,

I have a few more questions.

I have created a class named INTPageRankVertex which is similar to SimplePageRankVertex and
generated a jar holding only

Then I tried to run it with the giraph command as below, but got classpath errors:

giraph INTPageRankVertex.jar org.test.INTPageRankVertex \
-ip /user/hdfs/pagerankinput/graph.input \
-op /user/hdfs/pagerankoutput/ \
-w 1 \
-if org.test.INTPageRankVertex.INTPageRankVertexInputFormat \
-of org.test.INTPageRankVertex.INTPageRankVertexOutputFormat

First I get,
Exception in thread "main" java.lang.ClassNotFoundException: org.test.INTPageRankVertex

In bin/giraph, the user jar is added to the classpath on line 58

but CLASSPATH is overwritten on line 87:
87.         CLASSPATH=`mvn dependency:build-classpath | grep -v "[INFO]"`

Changing line 87 as below solves my first problem. Is this patch valid?
87.         CLASSPATH=$CLASSPATH:`mvn dependency:build-classpath | grep -v "[INFO]"`
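
The difference between the two versions of line 87 can be reproduced without Maven. The sketch below is only an illustration: `fake_mvn_output` is a hypothetical stand-in for `mvn dependency:build-classpath`, and all jar paths are made up.

```shell
#!/bin/sh
# Hypothetical stand-in for the output of `mvn dependency:build-classpath`.
fake_mvn_output() {
  printf '%s\n' \
    '[INFO] Scanning for projects...' \
    '/repo/hadoop-core.jar:/repo/zookeeper.jar' \
    '[INFO] BUILD SUCCESS'
}

USER_JAR=/tmp/INTPageRankVertex.jar   # hypothetical location of the user jar

# Original form of line 87: plain assignment discards the user jar
# that was put on CLASSPATH earlier in the script.
CLASSPATH=$USER_JAR
CLASSPATH=$(fake_mvn_output | grep -v "[INFO]")
overwritten=$CLASSPATH

# Patched form of line 87: the Maven classpath is appended to the
# existing value, so the user jar is kept.
CLASSPATH=$USER_JAR
CLASSPATH=$CLASSPATH:$(fake_mvn_output | grep -v "[INFO]")
appended=$CLASSPATH

echo "overwritten: $overwritten"
echo "appended:    $appended"
```

In this sketch only the appended form still contains the user jar, which is consistent with the first error disappearing after the patch. Separately, `grep -v "[INFO]"` treats `[INFO]` as a bracket expression, so it drops every line containing an uppercase I, N, F or O, not only Maven's `[INFO]` lines; a classpath line with a jar name containing `SNAPSHOT`, for instance, would be filtered out as well.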

After changing line 87 I get a different classpath error:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/giraph/graph/LongDoubleFloatDoubleVertex

And I solved this problem by adding the line below

Are these patches necessary, or am I doing something wrong when running my code?

best regards..

On 18 February 2012 18:37, Avery Ching wrote:
IntIntNullIntTextInputFormat in the examples package (extending TextVertexInputFormat as David
suggests) is very similar to what you need I think, although the types might be different
for your application.  You can start with that perhaps.


On 2/18/12 7:48 AM, David Garcia wrote:
The easiest thing to do is to extend the text vertex and/or TextVertexInputFormat and/or the
record reader. The record reader will give you the vertices you want. Look at the record
reader for TextVertexInputFormat. It's an inner class of this format class.


----- Reply message -----
From: "yavuz gokirmak"
Subject: how to use SimplePageRankVertex
Date: Sat, Feb 18, 2012 9:08 am


I am planning to use Giraph for network analysis. First I am trying to fully understand the SimplePageRankVertex
implementation and modify it to serve my needs.

I have a question about the example:
What is the expected input format for SimplePageRankVertex? I couldn't understand the input
format, although the SimplePageRankVertexReader class has only a few lines.

My input file consists of rows such as:
usera, userb
usera, userc
userc, usera
userb, userc
userc, userb
Each row represents a relation between two users:
"usera,userb" means that "usera clicked userb's profile".

Is it possible to do social network analysis on this kind of data using Giraph?
I would be glad of any advice.

thanks in advance
best regards
