hadoop-general mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject Re: Chaining MapReduce Jobs
Date Thu, 08 Nov 2012 19:12:58 GMT
Have you looked at the ToolRunner class? 

On Nov 8, 2012, at 7:03 AM, Claudio Reggiani <nophiq@gmail.com> wrote:

> Hello,
> I would like to run a Hadoop program composed of
> Map1-Red1->Map2-Red2->Map3-Red3. I've read "Hadoop in Action" and several
> articles online, but all of them are either based on API <= 0.20 or
> show only a few lines of code.
> I'm working with Hadoop 1.0.3 and I think the best solution is to use the
> JobControl class, but I haven't found a good example of that.
> In my particular application the MapReduce jobs are executed in sequence,
> so it should be possible to run the first job, then the second, and finally
> the third. The problem is that I have to set the input and output
> directories for the second job, which means linking the output of job1 to
> the input of job2, and I don't know how to do that.
> Any suggestions or resources for solving this problem? Even source code on
> GitHub would help.
> Thanks
> Claudio
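
The sequential approach described in the question can be sketched as follows. This is not code from the thread, only a minimal illustration under stated assumptions: each job's output directory is handed to the next job as its input directory, and the chain stops at the first failure. The names ChainPlan, Stage, StageRunner and the stepN directory layout are hypothetical, and Hadoop itself is left out so that the block runs with only the JDK. In a real driver written against the new API (org.apache.hadoop.mapreduce), the runner body would presumably configure a Job with FileInputFormat.addInputPath and FileOutputFormat.setOutputPath and return job.waitForCompletion(true); that driver could implement Tool and be launched through ToolRunner.run.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: plans and runs a linear chain of jobs.
final class ChainPlan {

    // One job in the chain: the directory it reads and the directory it writes.
    static final class Stage {
        final String input;
        final String output;

        Stage(String input, String output) {
            this.input = input;
            this.output = output;
        }
    }

    // Stand-in for "configure and run one MapReduce job" (assumption, not a
    // Hadoop type). A Hadoop-backed implementation would set the job's input
    // and output paths from the stage and return job.waitForCompletion(true).
    interface StageRunner {
        boolean run(int index, Stage stage);
    }

    // Builds the chain so that the output of job i is the input of job i+1.
    // Intermediate results go under workDir; the last job writes finalOutput.
    static List<Stage> plan(String input, String workDir, String finalOutput, int jobs) {
        if (jobs < 1) {
            throw new IllegalArgumentException("need at least one job");
        }
        List<Stage> stages = new ArrayList<>();
        String in = input;
        for (int i = 1; i <= jobs; i++) {
            String out = (i == jobs) ? finalOutput : workDir + "/step" + i;
            stages.add(new Stage(in, out));
            in = out; // link this job's output to the next job's input
        }
        return stages;
    }

    // Runs the stages in order. Returns 0 if all succeed, otherwise the
    // 1-based number of the first job that failed; later jobs are skipped.
    static int runChain(List<Stage> stages, StageRunner runner) {
        for (int i = 0; i < stages.size(); i++) {
            if (!runner.run(i, stages.get(i))) {
                return i + 1;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        List<Stage> stages = plan("/data/in", "/tmp/chain", "/data/out", 3);
        // Dummy runner that only prints the wiring of each job.
        int exit = runChain(stages, (i, s) -> {
            System.out.println("job" + (i + 1) + ": " + s.input + " -> " + s.output);
            return true;
        });
        System.out.println("exit code " + exit);
    }
}
```

Note that Hadoop refuses to start a job whose output directory already exists, so the intermediate directories would need to be fresh (or deleted) on every run.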
