incubator-crunch-user mailing list archives

From Dave Beech <d...@paraliatech.com>
Subject Re: Question about mapreduce job planner
Date Tue, 15 Jan 2013 21:00:26 GMT
Thanks Josh - that's great. I'll file a JIRA about the side-outputs
feature, but the pipeline.run() call will serve my purpose for now.

Cheers,
Dave

On 15 January 2013 18:03, Josh Wills <jwills@cloudera.com> wrote:

> Hey Dave,
>
> The way to force a sequential run would be to call pipeline.run() after
> you write D to HDFS and before you declare the operations in step 6. What
> we would really want here is a single MapReduce job that wrote side outputs
> on the map side to create the dataset in step D, but we don't have support
> for side-outputs in maps yet. Worth filing a JIRA, I think.
>
> Thanks!
> Josh
>
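
The ordering Josh describes can be illustrated with a toy model of a lazily planned pipeline. This is only a sketch and not Crunch's API: ToyPipeline, declare() and executedBatches() are hypothetical names, and the model assumes that operations are only planned when run() is called. It shows why a run() placed between writing D and declaring step 6 acts as a barrier.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for a lazily planned pipeline. NOT Crunch's API: ToyPipeline,
// declare() and executedBatches() are hypothetical names used for illustration.
class ToyPipeline {
    // Operations declared since the last run(); nothing executes at declaration time.
    private final List<String> pending = new ArrayList<>();
    // One entry per run() call, listing the operations planned together in that run.
    private final List<List<String>> executedBatches = new ArrayList<>();

    void declare(String operation) {
        pending.add(operation);
    }

    // Plans and "executes" everything declared so far, then clears the queue.
    void run() {
        if (!pending.isEmpty()) {
            executedBatches.add(new ArrayList<>(pending));
            pending.clear();
        }
    }

    List<List<String>> executedBatches() {
        return executedBatches;
    }

    public static void main(String[] args) {
        ToyPipeline pipeline = new ToyPipeline();
        pipeline.declare("write D to HDFS");
        pipeline.run();  // barrier: D is finished before anything below is planned
        pipeline.declare("step 6 operations");
        pipeline.run();
        System.out.println(pipeline.executedBatches());
    }
}
```

Without the first run() call, both operations would sit in the same pending batch, and in this model the planner would see them together rather than one after the other.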
