giraph-dev mailing list archives

From "Sergey Edunov" <edu...@gmail.com>
Subject Re: Review Request 23989: Improve checkpointing
Date Tue, 12 Aug 2014 00:54:00 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23989/
-----------------------------------------------------------

(Updated Aug. 12, 2014, 12:53 a.m.)


Review request for giraph.


Changes
-------

Addressing CR issues


Repository: giraph-git


Description
-------

We need to address some issues with checkpointing:
1) worker2worker messages are not saved
2) BspServiceWorker does not compile under hadoop_0.23 profile
3) it would be nice to be able to manually checkpoint and stop any job at any point in time.

Changes:

1) worker2worker messages are now saved by serializing the current worker-to-worker messages
(they are a list of Writables, so I had to write class information as well)
2) Compilation issues are fixed
3) You can now trigger checkpointing by creating a /_checkpointAndStop node in ZooKeeper
(the same way _haltComputation works). After that, the behavior of the job is determined
by the registered GiraphJobRetryChecker. By default, the job is checkpointed at the end of the
current superstep and halted. You can override this behavior by making shouldRestartCheckpoint()
return true; in that case the job is restarted immediately after being checkpointed.
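
To illustrate change 1, here is a minimal, self-contained sketch of why class information has
to be written when a list holds writables of different concrete types. It is not the code from
this patch: SimpleWritable, IntMessage, TextMessage and ClassTaggedList are hypothetical names,
and SimpleWritable is only a stand-in for Hadoop's Writable so the example runs without Hadoop
on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class ClassTaggedList {
  /** Hypothetical stand-in for Hadoop's Writable (assumption, not the real interface). */
  interface SimpleWritable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
  }

  /** Hypothetical message type carrying a single int. */
  static final class IntMessage implements SimpleWritable {
    int value;
    IntMessage() { }
    IntMessage(int value) { this.value = value; }
    @Override public void write(DataOutput out) throws IOException { out.writeInt(value); }
    @Override public void readFields(DataInput in) throws IOException { value = in.readInt(); }
  }

  /** Hypothetical message type carrying a string. */
  static final class TextMessage implements SimpleWritable {
    String text = "";
    TextMessage() { }
    TextMessage(String text) { this.text = text; }
    @Override public void write(DataOutput out) throws IOException { out.writeUTF(text); }
    @Override public void readFields(DataInput in) throws IOException { text = in.readUTF(); }
  }

  /** Writes the list size, then each element prefixed with its class name. */
  static void writeList(List<? extends SimpleWritable> list, DataOutput out)
      throws IOException {
    out.writeInt(list.size());
    for (SimpleWritable w : list) {
      out.writeUTF(w.getClass().getName());
      w.write(out);
    }
  }

  /** Reads the list back, using the stored class name to create each element. */
  static List<SimpleWritable> readList(DataInput in) throws IOException {
    int size = in.readInt();
    List<SimpleWritable> result = new ArrayList<>(size);
    for (int i = 0; i < size; i++) {
      String className = in.readUTF();
      SimpleWritable w;
      try {
        // Requires a no-argument constructor on every element class.
        w = (SimpleWritable) Class.forName(className)
            .getDeclaredConstructor().newInstance();
      } catch (ReflectiveOperationException e) {
        throw new IOException("Cannot instantiate " + className, e);
      }
      w.readFields(in);
      result.add(w);
    }
    return result;
  }

  /** Serializes the list to a byte array and reads it back. */
  static List<SimpleWritable> roundTrip(List<? extends SimpleWritable> list)
      throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    writeList(list, new DataOutputStream(bytes));
    return readList(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
  }
}
```

Without the class-name prefix the reader would not know which type to instantiate for each
element, which is the problem a heterogeneous list of writables poses at checkpoint restore.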


Diffs (updated)
-----

  giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java 02577b9 
  giraph-core/src/main/java/org/apache/giraph/bsp/CentralizedService.java ff3e427 
  giraph-core/src/main/java/org/apache/giraph/bsp/CentralizedServiceMaster.java e5b7cf3 
  giraph-core/src/main/java/org/apache/giraph/bsp/CentralizedServiceWorker.java e5d0ae1 
  giraph-core/src/main/java/org/apache/giraph/bsp/CheckpointStatus.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/bsp/SuperstepState.java c384fbf 
  giraph-core/src/main/java/org/apache/giraph/comm/ServerData.java a92cd1c 
  giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 0424a47 
  giraph-core/src/main/java/org/apache/giraph/graph/FinishedSuperstepStats.java c351778 
  giraph-core/src/main/java/org/apache/giraph/graph/GlobalStats.java bc56c9c 
  giraph-core/src/main/java/org/apache/giraph/graph/GraphTaskManager.java 6ebb002 
  giraph-core/src/main/java/org/apache/giraph/job/DefaultGiraphJobRetryChecker.java 0cab86c 
  giraph-core/src/main/java/org/apache/giraph/job/GiraphJob.java 4a1f02e 
  giraph-core/src/main/java/org/apache/giraph/job/GiraphJobRetryChecker.java 53a800e 
  giraph-core/src/main/java/org/apache/giraph/job/HadoopUtils.java 9530fd6 
  giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java e129390 
  giraph-core/src/main/java/org/apache/giraph/master/MasterThread.java 0635210 
  giraph-core/src/main/java/org/apache/giraph/utils/CheckpointingUtils.java PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/utils/WritableUtils.java 763f59d 
  giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java d2d24ee 
  giraph-core/src/test/java/org/apache/giraph/utils/TestWritableUtils.java PRE-CREATION 
  giraph-examples/src/test/java/org/apache/giraph/TestCheckpointing.java 2939af7 
  pom.xml ed2a98c 

Diff: https://reviews.apache.org/r/23989/diff/


Testing
-------

Ran PageRank; will keep testing with different jobs.


Thanks,

Sergey Edunov

