giraph-user mailing list archives

From Vitaly Tsvetkoff <vi.v.tsvetk...@gmail.com>
Subject Run custom project using giraph on yarn cluster
Date Fri, 17 Jul 2015 06:21:17 GMT
Hello everyone!
Apache Giraph is a very useful framework for working with big-data graphs,
but I am having difficulty using it in my own custom project.

Here is what I have done so far. I downloaded giraph-1.2.0 and built it
with:
    mvn clean install -DskipTests -Dcheckstyle.skip=true -Phadoop_yarn -Dhadoop.version=2.6.0-cdh5.4.4
Then I copied
*giraph-examples-1.2.0-SNAPSHOT-for-hadoop-2.6.0-cdh5.4.3-jar-with-dependencies.jar*
to my Cloudera YARN cluster and ran some out-of-the-box computations (SSSP,
SimplePageRank), and they work fine.
Now I want to use Giraph in my own custom project, so I added the
dependencies that this tutorial (http://www.youtube.com/watch?v=alegx3sP7hc)
recommends:
        <dependency>
            <groupId>org.apache.giraph</groupId>
            <artifactId>giraph-examples</artifactId>
            <version>1.1.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-core</artifactId>
            <version>1.2.1</version>
        </dependency>
where *giraph-examples-1.1.0* is the regular release from the Maven
repository. It works fine on my local machine. Then I built a shaded jar for
the cluster, but it does not work there! I run it like this:
    hadoop jar graph-1.0.jar \
      org.apache.giraph.GiraphRunner \
      -Dgiraph.isStaticGraph=true \
      -Dgiraph.useOutOfCoreGraph=true \
      -Dgiraph.useOutOfCoreMessages=true \
      org.apache.giraph.examples.SimplePageRankComputation \
      -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
      -vip /tmp/giraph_input/graph.json \
      -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
      -op /tmp/giraph \
      -w 32 \
      -mc org.apache.giraph.examples.SimplePageRankComputation\$SimplePageRankMasterCompute \
      -yj graph-1.0.jar
I get strange errors like "*Unrecognized options -D*". After removing the
*-D* options just for testing, I get "*java.lang.RuntimeException: class
org.apache.giraph.GiraphRunner not org.apache.giraph.graph.Computation*".
When I switch the dependency to *giraph-examples-1.2.0-SNAPSHOT*, it stops
working locally (because it was a special build for hadoop_yarn).

Please tell me how to run Giraph computations in my custom project!
Should I use *giraph-examples-1.2.0-SNAPSHOT* or *giraph-examples-1.1.0*?
Which *hadoop-core* (or *hadoop-common*?) dependency do I need for the
cluster (maybe a special hadoop-core from the Cloudera repository)?

Looking forward to your reply!
