crunch-dev mailing list archives

From Josh Wills
Subject Re: outputTargets in DistributedPipeline
Date Thu, 19 Jun 2014 16:41:36 GMT
Probably. IIRC, we need to do some work to make plan() idempotent, and it
seems like the outputTargets cleanup logic should move into execute() (or,
at the least, execute() should verify that the outputTargets the plan was
responsible for were removed once the execution completed).
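
To make the concern concrete, here is a toy model in plain Java. It is not
Crunch code: ToyPipeline, ToyExecutor, planAndExecute and pendingTargets are
hypothetical names, and the only behavior assumed is what this thread
describes, namely that targets accumulate on the pipeline and that runAsync
is the one place that clears them. The planAndExecute method sketches the
suggested direction of cleaning up after execution; it is an illustration,
not an existing API.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class OutputTargetsSketch {

    /** Hypothetical stand-in for a planned executor; NOT Crunch's MRExecutor. */
    public static final class ToyExecutor {
        private final List<String> plannedTargets;

        ToyExecutor(Set<String> targets) {
            // Snapshot the targets this plan is responsible for.
            this.plannedTargets = new ArrayList<>(targets);
        }

        /** Pretends to run the jobs and reports which targets were written. */
        public List<String> execute() {
            return new ArrayList<>(plannedTargets);
        }
    }

    /** Hypothetical stand-in for a pipeline; NOT Crunch's MRPipeline. */
    public static final class ToyPipeline {
        private final Set<String> outputTargets = new LinkedHashSet<>();

        public void write(String target) {
            outputTargets.add(target);
        }

        /** Planning reads outputTargets but does not clear them. */
        public ToyExecutor plan() {
            return new ToyExecutor(outputTargets);
        }

        /** Assumed behavior from the thread: the only place cleanup happens. */
        public List<String> runAsync() {
            List<String> written = plan().execute();
            outputTargets.clear();
            return written;
        }

        /** Sketch of the suggestion: remove only what this plan handled. */
        public List<String> planAndExecute() {
            List<String> written = plan().execute();
            outputTargets.removeAll(written);
            return written;
        }

        public int pendingTargets() {
            return outputTargets.size();
        }
    }

    public static void main(String[] args) {
        // Bypassing runAsync: the first target is still pending on the second run.
        ToyPipeline bypass = new ToyPipeline();
        bypass.write("out/a");
        bypass.plan().execute();
        bypass.write("out/b");
        System.out.println(bypass.plan().execute()); // [out/a, out/b]

        // Cleaning up after execution: the second run sees only the new target.
        ToyPipeline cleaned = new ToyPipeline();
        cleaned.write("out/a");
        cleaned.planAndExecute();
        cleaned.write("out/b");
        System.out.println(cleaned.planAndExecute()); // [out/b]
    }
}
```

In this model, the ill effect of reusing a pipeline after calling plan and
execute directly is that earlier targets are planned again on the next run.
Whether the real classes behave the same way depends on details this toy
does not capture.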

On Thu, Jun 19, 2014 at 9:29 AM, Allan Shoup wrote:

> As far as I can tell, the only way the outputTargets in DistributedPipeline
> get cleaned out is in MRPipeline.runAsync. If I were to bypass runAsync and
> instead directly utilize MRPipeline.plan and MRExecutor.execute, would
> there be ill effects if I wanted to continue to use that same pipeline to
> run additional jobs?

Director of Data Science
Cloudera
Twitter: @josh_wills
