giraph-dev mailing list archives

From "Eli Reisman (JIRA)" <>
Subject [jira] [Updated] (GIRAPH-13) Port Giraph to YARN
Date Wed, 13 Mar 2013 23:06:15 GMT


Eli Reisman updated GIRAPH-13:

    Attachment: GIRAPH-13-9-r3.patch

OK, this is ready to go, passes mvn verify (with and without -Phadoop_yarn) and passes its
new integration tests with MiniYARNCluster.

To make the test cluster work, we will initially have to support only Hadoop 2.0.3-alpha
and later. I can attempt further backports in a future JIRA.

There are no more hardcoded includes, so you need to pass the -yj option to GiraphRunner with
a comma-separated list of jar filenames (no paths) to make your job run. For instance:

mvn -Phadoop_yarn clean package

cp giraph-examples/target/giraph*-jar-with*.jar ~/hadoop/share/hadoop/giraph/

hstart # start your Hadoop-2.0.3-alpha cluster
       # AND your OWN instance of ZK on some port
       # put this in -ca giraph.zkList=... in the launch commands below if you don't use
       # giraph-site for this!

bin/hadoop --config etc/hadoop jar share/hadoop/giraph/giraph-examples-0.2-SNAPSHOT-for-hadoop-2.0.3-alpha-jar-with-dependencies.jar \
  org.apache.giraph.GiraphRunner org.apache.giraph.examples.ConnectedComponentsVertex -w 3 -yh 1024 \
  -yj giraph-examples-0.2-SNAPSHOT-for-hadoop-2.0.3-alpha-jar-with-dependencies.jar -vif -of \
  -vip /user/ereisman/graph3milVerts -op /user/ereisman/output

The above builds the project, then copies the giraph-examples jar with dependencies to a folder
that we assume is in or under a directory on the CLASSPATH, under HADOOP_HOME, or at least in
your working dir. Last, we run a connected components job (assuming we have some sample data
in our HDFS input dir, and a 2.0.3 cluster up and running).
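
As a rough sketch of what the -yj value looks like when you have several jars, the helper
below strips the directory part from each path and joins the bare filenames with commas.
yj_list is a hypothetical name, not something shipped with Giraph, and the jar paths in the
example are made up.

```shell
#!/bin/sh
# Hypothetical helper (not part of Giraph): turn jar paths into the
# comma-separated, path-free filename list that -yj expects.
yj_list() {
  out=""
  for jar in "$@"; do
    name=$(basename "$jar")   # drop the directory part, keep the filename
    if [ -z "$out" ]; then
      out="$name"
    else
      out="$out,$name"
    fi
  done
  printf '%s\n' "$out"
}

# Example with two made-up jar paths; prints: foo.jar,bar.jar
yj_list /tmp/a/foo.jar ./bar.jar
```

The result could then be passed as -yj "$(yj_list ...)" on the GiraphRunner command line.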

Right now all setStatus() calls go straight into the task logs, so we didn't lose them, but
they are not yet aggregated in a web UI for us. Logs are prefixed by task number (numbered 2
higher than the corresponding Giraph task numbers); task 1 is always our GiraphApplicationMaster.
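
To make the log numbering concrete, here is a small sketch of the mapping in both directions.
The function names are hypothetical, and it assumes Giraph task numbers start at 0, which is
not stated above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Giraph. Assumes Giraph task numbers
# start at 0 and that log prefixes are always 2 higher.

# Giraph task number -> task number used as the log prefix.
log_task_for() {
  echo $(( $1 + 2 ))
}

# Log task number -> Giraph task number, or the AM name for task 1.
giraph_task_for() {
  if [ "$1" -eq 1 ]; then
    echo "GiraphApplicationMaster"
  elif [ "$1" -ge 2 ]; then
    echo $(( $1 - 2 ))
  else
    echo "unknown log task: $1" >&2
    return 1
  fi
}

log_task_for 0      # prints 2
giraph_task_for 1   # prints GiraphApplicationMaster
giraph_task_for 5   # prints 3
```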

JIRAs I will put up related to this:

- create WebUI for Giraph

- add a process launch to GiraphApplicationMaster for our local ZK if we choose one, and put
its host:port into zkList so Giraph-BSP doesn't take over and do it.

- backport to 2.0.2-alpha, or even 2.0.0 Hadoop

- lots of strange and wonderful new things are possible, we'll see about the rest as we go

> Port Giraph to YARN
> -------------------
>                 Key: GIRAPH-13
>                 URL:
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Jakob Homan
>            Assignee: Eli Reisman
>         Attachments: GIRAPH-13-1.patch, GIRAPH-13-2.patch, GIRAPH-13-3.patch, GIRAPH-13-4.patch,
> GIRAPH-13-5.patch, GIRAPH-13-6.patch, GIRAPH-13-7.patch, GIRAPH-13-8.patch, GIRAPH-13-9.patch,
> GIRAPH-13-9-r1.patch, GIRAPH-13-9-r2.patch, GIRAPH-13-9-r3.patch
>
> Now that YARN (aka MR2 aka MAPREDUCE-279) has been merged into the Hadoop trunk, we should
> think about what it would take to separate out the graph processing bits of Giraph from the
> MR1-specific code so as to take advantage of the less-MR centric aspects of YARN, while still
> supporting both over the medium term.
