hadoop-common-user mailing list archives

From Amar Kamat <ama...@yahoo-inc.com>
Subject Re: how to set the result of the first mapreduce program as the input of the second mapreduce program?
Date Thu, 21 Feb 2008 19:19:24 GMT
The output of every MapReduce job in Hadoop is stored in the DFS, i.e. it is
made visible. You can run back-to-back jobs (i.e. job chaining), but the
intermediate output won't be temporary; you have to remove it yourself once
the second job has finished. Look at Grep.java, as Hairong suggested, for more
details on job chaining. As of now there is no built-in support for job
chaining in Hadoop. Pig [http://incubator.apache.org/pig/], on the other hand,
does job pipelining implicitly. For small, simple pipelines you can do the
chaining manually. It depends on the kind of pipelining you require.
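
To illustrate the shape of manual chaining, here is a minimal sketch in plain Java that runs on the local filesystem rather than on Hadoop. The names ManualChain, countWords, filterByCount and runChain are hypothetical, and each stage method only stands in for a blocking job submission (in a real driver, something like JobClient.runJob on a configured JobConf). The point is the pattern: the first stage writes to an intermediate location, the second stage reads from it, and the driver deletes it afterwards.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Local-filesystem stand-in for a two-job chain; not Hadoop code.
class ManualChain {

    // Stage 1 (stand-in for the first job): write "word<TAB>count" lines.
    static void countWords(Path in, Path out) throws IOException {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : Files.readAllLines(in)) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            lines.add(e.getKey() + "\t" + e.getValue());
        }
        Files.write(out, lines);
    }

    // Stage 2 (stand-in for the second job): keep words seen at least min times.
    static void filterByCount(Path in, Path out, int min) throws IOException {
        List<String> kept = new ArrayList<>();
        for (String line : Files.readAllLines(in)) {
            String[] kv = line.split("\t");
            if (Integer.parseInt(kv[1]) >= min) {
                kept.add(kv[0]);
            }
        }
        Files.write(out, kept);
    }

    // Driver: stage 1 output becomes stage 2 input, then is cleaned up.
    static void runChain(Path input, Path finalOutput, int min) throws IOException {
        Path intermediate = Files.createTempFile("stage1-", ".tsv");
        try {
            countWords(input, intermediate);                // first "job"
            filterByCount(intermediate, finalOutput, min);  // second "job"
        } finally {
            Files.deleteIfExists(intermediate);             // intermediate result is not kept
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: ManualChain <input> <output>");
            return;
        }
        runChain(Paths.get(args[0]), Paths.get(args[1]), 2);
    }
}
```

In a real Hadoop driver the intermediate path would be a DFS directory and the cleanup would go through the Hadoop FileSystem API, but the control flow would look much the same.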
Amar
ma qiang wrote:
> Hi all:
>      I have two MapReduce programs. I need to use the result of the
> first program to compute other values that are generated in
> the second program, and this intermediate result does not need
> to be saved,
> so I want to run the second program automatically, using the
> output of the first program as the input of the second.
> Who can tell me how?
>      Thanks!
>      Best Wishes!
>
> Qiang
>   

