giraph-dev mailing list archives

From "Eli Reisman" <initialcont...@gmail.com>
Subject Re: Review Request: GIRAPH-13: Port Giraph to YARN
Date Thu, 28 Mar 2013 22:11:24 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9811/
-----------------------------------------------------------

(Updated March 28, 2013, 10:11 p.m.)


Review request for giraph.


Changes
-------

Just a rebase. Ready for review.


Description (updated)
-------

Port Giraph to "pure YARN" clusters, using Hadoop MapReduce classes in our code (IO formats
etc.) but running the cluster job without any active participation by a running MapReduce
framework. This means doing some things ourselves that Hadoop used to do for us.

This patch is ready for review. There is an integration test to verify the YARN components
can run a no-op Giraph job successfully. All BSP code is covered by our MRv1 tests, which
are sufficient because once Giraph is running, it does not know or care whether it's running
on YARN. We use minimal munge flags, and mostly trick the internal BSP Giraph code into
thinking we're still running on MR by supplying the GraphTaskManager with a "dummy"
Mapper#Context that tells it what it needs to know to run the job on the YARN cluster
instead. This lets us postpone ripping apart our IO formats and other baked-in MRv1
dependencies until we're ready to abandon MR. It also sets up a paradigm that should make it
easy to port us to other cluster frameworks (Mesos, etc.).
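
To illustrate the "dummy" context idea in isolation, here is a rough sketch in plain Java.
It is not the code in this patch and uses no Hadoop classes: TaskContext, BspTask,
YarnDummyContext and the "sketch.workers" key are all hypothetical names made up for the
example. The point is only that the BSP side talks to a narrow context interface, so a
launcher on another framework can hand it a stand-in that answers the same questions.

```java
import java.util.HashMap;
import java.util.Map;

public class DummyContextSketch {

  /** Hypothetical stand-in for the small slice of a Mapper#Context the BSP code reads. */
  interface TaskContext {
    String getConfValue(String key, String defaultValue);
    int getTaskId();
    void progress();
  }

  /** Hypothetical stand-in for BSP code that only ever sees a TaskContext. */
  static final class BspTask {
    private final TaskContext context;

    BspTask(TaskContext context) {
      this.context = context;
    }

    String describe() {
      // Report liveness, then read what we need from the context.
      context.progress();
      int workers = Integer.parseInt(context.getConfValue("sketch.workers", "1"));
      return "task " + context.getTaskId() + " of " + workers;
    }
  }

  /** Dummy context a non-MR launcher could build from its own launch arguments. */
  static final class YarnDummyContext implements TaskContext {
    private final Map<String, String> conf;
    private final int taskId;
    private int progressCalls;

    YarnDummyContext(Map<String, String> conf, int taskId) {
      this.conf = new HashMap<>(conf);
      this.taskId = taskId;
    }

    @Override
    public String getConfValue(String key, String defaultValue) {
      return conf.getOrDefault(key, defaultValue);
    }

    @Override
    public int getTaskId() {
      return taskId;
    }

    @Override
    public void progress() {
      // No MR framework is listening, so just count the calls.
      progressCalls++;
    }

    int getProgressCalls() {
      return progressCalls;
    }
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put("sketch.workers", "4");
    YarnDummyContext context = new YarnDummyContext(conf, 2);
    System.out.println(new BspTask(context).describe()); // prints "task 2 of 4"
  }
}
```

Under this assumption, porting to another framework would mostly mean writing another
TaskContext implementation rather than touching BspTask.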

My goal is to make this not only our port to YARN, but another (there aren't many) good and
well-commented example of how to run "real applications" like Giraph on YARN clusters. So
I'm hoping it's clear and easy to follow on that level too. Happy to hear feedback on that
angle as well!

Thanks! Will post a wiki page explaining a bit more about this when it's all finished. This
version still depends on Hadoop-2.0.3-alpha, but a future JIRA can bring us to 2.0.0
or higher (and trunk, of course), depending on what sort of YARN-enabled Hadoop versions we
want to support.


Diffs
-----

  checkstyle.xml 370c120 
  giraph-core/pom.xml 3580d0c 
  giraph-core/src/main/java/org/apache/giraph/GiraphRunner.java 5bd5686 
  giraph-core/src/main/java/org/apache/giraph/bsp/BspInputFormat.java cc53271 
  giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 963b82a 
  giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java c5b9b93 
  giraph-core/src/main/java/org/apache/giraph/graph/GraphTaskManager.java 57f7dff 
  giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java 404e47e 
  giraph-core/src/main/java/org/apache/giraph/utils/ConfigurationUtils.java bd30455 
  giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java 74c1f87 
  giraph-core/src/main/java/org/apache/giraph/yarn/GiraphApplicationMaster.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/yarn/GiraphYarnClient.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/yarn/GiraphYarnTask.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/yarn/YarnUtils.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/yarn/package-info.java PRE-CREATION 
  giraph-core/src/test/java/org/apache/giraph/yarn/TestYarnJob.java PRE-CREATION 
  giraph-core/src/test/resources/capacity-scheduler.xml PRE-CREATION 
  giraph-examples/pom.xml 3b6a08c 
  pom.xml 1e321b8 

Diff: https://reviews.apache.org/r/9811/diff/


Testing (updated)
-------

Integration tests included.


Thanks,

Eli Reisman

