pig-dev mailing list archives

From "Abhishek Agarwal" <abhishc...@gmail.com>
Subject Re: Review Request 39226: PIG-4680 [Pig workflows can checkpoint the state and can resume from the last successful node]
Date Fri, 27 Nov 2015 15:40:05 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39226/
-----------------------------------------------------------

(Updated Nov. 27, 2015, 3:40 p.m.)


Review request for pig and Rohini Palaniswamy.


Repository: pig-git


Description
-------

A Pig script can compile into a DAG of multiple ETL jobs that may take hours to finish. If
a transient error fails one job, the whole run fails. When the script is rerun, every node
in the job graph runs again, including nodes that had already completed successfully. These
redundant runs waste cluster capacity and delay the pipeline.

On failure, we can persist the state of the job graph. On the next run, only the failed
nodes and their successors are rerun. This is subject to preconditions such as
         > Pig script has not changed
         > Input locations have not changed
         > Output data from previous run is intact
         > Configuration has not changed
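
The resume rule described above can be sketched roughly as follows. This is only an
illustration, not the MRJobRecovery/MRJobState code in the diff: the class ResumeSketch,
both method names, and the choice of a SHA-256 fingerprint are assumptions made for the
example. It assumes every node appears as a key in the successor map, and it does not
cover the "output data is intact" check.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; names and structure are not taken from the patch.
public class ResumeSketch {

    // Hashes the items that must be unchanged between runs (script text,
    // input locations, configuration). A caller would compare this against
    // the fingerprint saved with the checkpoint and fall back to a full run
    // when the two differ.
    static String fingerprint(String script, List<String> inputs, Map<String, String> conf) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            md.update(script.getBytes(StandardCharsets.UTF_8));
            for (String input : inputs) {
                md.update((byte) 0); // separator between fields
                md.update(input.getBytes(StandardCharsets.UTF_8));
            }
            // Sort the keys so the hash does not depend on map iteration order.
            for (String key : new TreeSet<>(conf.keySet())) {
                md.update((byte) 0);
                md.update((key + "=" + conf.get(key)).getBytes(StandardCharsets.UTF_8));
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the nodes to run again: every node that did not succeed, plus
    // everything reachable from such a node through successor edges.
    static Set<String> nodesToRerun(Map<String, List<String>> successors, Set<String> succeeded) {
        Set<String> rerun = new TreeSet<>();
        Deque<String> queue = new ArrayDeque<>();
        for (String node : successors.keySet()) {
            if (!succeeded.contains(node)) {
                rerun.add(node);
                queue.add(node);
            }
        }
        // Breadth-first walk over successors of nodes already marked for rerun.
        while (!queue.isEmpty()) {
            String current = queue.poll();
            for (String next : successors.getOrDefault(current, Collections.<String>emptyList())) {
                if (rerun.add(next)) {
                    queue.add(next);
                }
            }
        }
        return rerun;
    }
}
```

For a graph load -> filter -> join -> store with a second branch lookup -> join, where
load, filter and lookup succeeded before the failure, nodesToRerun returns only join and
store.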


Diffs (updated)
-----

  src/org/apache/pig/PigConfiguration.java 54959fe 
  src/org/apache/pig/PigServer.java ee52472 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java 595e68c 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/MRIntermediateDataVisitor.java 4b62112 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/MRJobRecovery.java PRE-CREATION 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/MRJobState.java PRE-CREATION

  src/org/apache/pig/impl/PigImplConstants.java 050a243 
  src/org/apache/pig/impl/io/FileLocalizer.java f0f9b43 
  src/org/apache/pig/tools/grunt/GruntParser.java 439d087 

Diff: https://reviews.apache.org/r/39226/diff/


Testing
-------


Thanks,

Abhishek Agarwal

