crunch-user mailing list archives

From Stephen Durfey
Subject Cleaning up after exceptions
Date Fri, 28 Mar 2014 17:05:49 GMT
Suppose I have already called Pipeline#run (during which Crunch created some temporary
directories) and have gone on to do additional processing (creating new PCollections and
specifying a write location). If an exception then occurs in my own code, outside of the
pipeline and before Pipeline#run is called again, I need a way to ensure that the temporary
directories created by the initial run are always cleaned up. I could call Pipeline#done,
which calls cleanup() in MRPipeline, but it also calls run(). Given the exception thrown in
my code, I would prefer that run() not be called at all.

Would it be possible to make cleanup() public in the Pipeline interface so that it can be
used to clean up any temp directories created by the pipeline?
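
To make the request concrete, here is a minimal sketch of the usage pattern I have in mind. It does not use Crunch itself: SketchPipeline and RecordingPipeline are hypothetical stand-ins that only record which methods were called, and the public cleanup() shown here is the method being asked for, not something the Pipeline interface is known to offer.

```java
import java.util.ArrayList;
import java.util.List;

public class CleanupSketch {

    /** Hypothetical stand-in for a pipeline; NOT Crunch's real Pipeline interface. */
    interface SketchPipeline {
        void run();
        void cleanup(); // assumed public here; this is the change being requested
    }

    /** Records method calls instead of running jobs or touching temp directories. */
    static class RecordingPipeline implements SketchPipeline {
        final List<String> calls = new ArrayList<>();

        @Override
        public void run() {
            calls.add("run");
        }

        @Override
        public void cleanup() {
            calls.add("cleanup");
        }
    }

    /**
     * Runs the pipeline, then user code, then the pipeline again.
     * cleanup() is always called, and it never triggers an extra run().
     */
    static List<String> runWithCleanup(RecordingPipeline pipeline, Runnable userCode) {
        try {
            pipeline.run();   // initial run; temp directories would exist after this
            userCode.run();   // processing outside the pipeline; may throw
            pipeline.run();   // second run, reached only if the user code succeeded
        } catch (RuntimeException e) {
            pipeline.calls.add("caught"); // note the failure; a real caller might rethrow
        } finally {
            pipeline.cleanup();           // remove temp output without calling run()
        }
        return pipeline.calls;
    }

    public static void main(String[] args) {
        List<String> calls = runWithCleanup(new RecordingPipeline(), () -> {
            throw new IllegalStateException("failure in user code");
        });
        System.out.println(calls); // prints [run, caught, cleanup]
    }
}
```

With a public cleanup(), the failure path ends with a single cleanup call and no second run(), which is the behavior Pipeline#done cannot give me today.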