hadoop-general mailing list archives

From Harsh J <qwertyman...@gmail.com>
Subject Re: Running jar files inside map task
Date Wed, 27 Oct 2010 09:51:28 GMT
On Wed, Oct 27, 2010 at 12:52 PM, gaurav bagga <gaur.vbagga@gmail.com> wrote:
> It would be great if you could tell or point me to an article which uses the
> output of first map reduce as input for the 2nd map reduce.

ChainMapper and ChainReducer let you configure a single job of the form
[MAP+ / REDUCE MAP*]: one or more mappers, then one reducer, then zero or
more mappers.

If you want to chain separate jobs instead (so the pipeline looks like
MR, MR, MR), look at org.apache.hadoop.mapred.jobcontrol.Job at:
http://hadoop.apache.org/common/docs/r0.20.0/api/org/apache/hadoop/mapred/jobcontrol/Job.html

Specifically, look at Job.addDependingJob, which makes a job wait until
the job it depends on has completed.
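
To make the hand-off concrete, here is a minimal in-memory sketch in plain
Java (not the Hadoop API) of two chained passes, where the second pass reads
what the first one produced. The class and method names (ChainedJobsSketch,
wordCount, groupByCount) are made up for illustration. In a real Hadoop setup
the hand-off would typically go through a directory on HDFS: the first job's
output path becomes the second job's input path.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of "MR then MR" chaining; all names here are hypothetical.
public class ChainedJobsSketch {

    // "Job 1": map each line to (word, 1) pairs, reduce by summing per word.
    static Map<String, Integer> wordCount(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // "Job 2": takes job 1's output as its input. Maps (word, count) to
    // (count, word), then reduces by collecting the words for each count.
    static Map<Integer, List<String>> groupByCount(Map<String, Integer> counts) {
        Map<Integer, List<String>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            grouped.computeIfAbsent(e.getValue(), k -> new ArrayList<>())
                   .add(e.getKey());
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a b a", "b c a");
        // The second pass may only start once the first has finished,
        // because it consumes the first pass's complete output.
        Map<String, Integer> first = wordCount(input);
        Map<Integer, List<String>> second = groupByCount(first);
        System.out.println(first);
        System.out.println(second);
    }
}
```

With JobControl, the second job would be registered as depending on the first
via addDependingJob, so that it starts only after the first has completed.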

>
> -Gaurav
>
>
>
> On Tue, Oct 26, 2010 at 7:02 PM, Kumar Harshit <hkumar.arora@gmail.com> wrote:
>
>> You can create a 2nd MapReduce job. The input to the mapper of the 2nd
>> MapReduce job is the output of the 1st MapReduce job. This way you can
>> tackle the issue.
>>
>> Hope it helps
>>
>> Kumar
>>
>> On Mon, Oct 25, 2010 at 1:42 PM, Ankit Gandhi <ankit.g1290@gmail.com>
>> wrote:
>>
>> > Hey,
>> > I want to know whether I can run a jar file inside a map task,
>> > because I have to use the output of that jar in my map task.
>> > I am able to run it in standalone mode, but it fails in
>> > pseudo-distributed mode.
>> > Thanks in advance
>> >
>> > --
>> > Ankit Gandhi
>> > Undergraduate
>> > Center for Visual Information Technology
>> > Computer Science Engineering & Dual Degree
>> > IIIT-Hyderabad
>> >
>>
>



-- 
Harsh J
www.harshj.com
