crunch-dev mailing list archives

From Micah Whitacre <mkw...@gmail.com>
Subject Re: outputTargets in DistributedPipeline
Date Thu, 19 Jun 2014 21:24:34 GMT
Here is the work for idempotent plan()[1]

[1] - https://issues.apache.org/jira/browse/CRUNCH-405?jql=project%20%3D%20CRUNCH%20AND%20reporter%20%3D%20currentUser()


On Thu, Jun 19, 2014 at 11:41 AM, Josh Wills <jwills@cloudera.com> wrote:

> Probably. We need to do some work to make plan() idempotent, IIRC, and it
> seems like the outputTargets cleanup logic should move into execute() (at
> least, verification that the outputTargets that the plan was responsible
> for were removed after the execution was complete.)
>
>
> On Thu, Jun 19, 2014 at 9:29 AM, Allan Shoup <allan.shoup@gmail.com> wrote:
>
> > As far as I can tell, the only way the outputTargets in DistributedPipeline
> > get cleaned out is in MRPipeline.runAsync. If I were to bypass runAsync and
> > instead directly utilize MRPipeline.plan and MRExecutor.execute, would
> > there be ill effects if I wanted to continue to use that same pipeline to
> > run additional jobs?
> >
>
>
>
> --
> Director of Data Science
> Cloudera <http://www.cloudera.com>
> Twitter: @josh_wills <http://twitter.com/josh_wills>
>
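
For readers following the thread, the concern raised above can be sketched with a toy model. This is not Crunch code: ToyPipeline, ToyExecutor, and every method body below are hypothetical stand-ins that only mirror the behavior described in the thread (targets accumulate on the pipeline, and only runAsync clears them). Under that assumption, calling plan() and execute() directly leaves earlier targets in place, so a later plan would pick them up again.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical stand-ins, NOT Crunch's real
// DistributedPipeline, MRPipeline, or MRExecutor.
class ToyExecutor {
    private final List<String> targets;

    ToyExecutor(List<String> targets) {
        this.targets = targets;
    }

    // "Runs" the plan and reports which targets it wrote.
    List<String> execute() {
        return new ArrayList<>(targets);
    }
}

class ToyPipeline {
    // Targets registered by write() and not yet cleared.
    private final List<String> outputTargets = new ArrayList<>();

    void write(String target) {
        outputTargets.add(target);
    }

    // Builds a plan from a snapshot of the pending targets.
    // Assumption from the thread: planning alone does not clear them.
    ToyExecutor plan() {
        return new ToyExecutor(new ArrayList<>(outputTargets));
    }

    // Plans, executes, then clears the pending targets.
    // Assumption from the thread: this is the only place cleanup happens.
    List<String> runAsync() {
        List<String> written = plan().execute();
        outputTargets.clear();
        return written;
    }

    public static void main(String[] args) {
        // Bypassing runAsync: the first target is still pending on the second run.
        ToyPipeline bypass = new ToyPipeline();
        bypass.write("out/a");
        System.out.println(bypass.plan().execute());
        bypass.write("out/b");
        System.out.println(bypass.plan().execute());

        // Using runAsync: each run only sees the targets added since the last one.
        ToyPipeline normal = new ToyPipeline();
        normal.write("out/a");
        System.out.println(normal.runAsync());
        normal.write("out/b");
        System.out.println(normal.runAsync());
    }
}
```

In this sketch the second direct plan() still contains out/a, which is the kind of ill effect the original question asks about. Moving the cleanup into execute(), as suggested in the reply, would remove the difference between the two paths in the toy model.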
