hadoop-user mailing list archives

From Bertrand Dechoux <decho...@gmail.com>
Subject Re: How to set 2mappers on 1 job
Date Sun, 23 Sep 2012 13:31:12 GMT
Harsh's solution is indeed cleaner and must be what you were looking for
(and there is a version for both mapred and mapreduce packages).

If you are curious, see:
https://svn.apache.org/repos/asf/hadoop/common/tags/release-1.0.3/src/mapred/org/apache/hadoop/mapred/lib/MultipleInputs.java
https://svn.apache.org/repos/asf/hadoop/common/tags/release-1.0.3/src/mapred/org/apache/hadoop/mapred/lib/DelegatingMapper.java

https://svn.apache.org/repos/asf/hadoop/common/tags/release-1.0.3/src/mapred/org/apache/hadoop/mapreduce/lib/input/MultipleInputs.java
https://svn.apache.org/repos/asf/hadoop/common/tags/release-1.0.3/src/mapred/org/apache/hadoop/mapreduce/lib/input/DelegatingMapper.java
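
To make the idea concrete, here is a minimal plain-Java sketch of what happens conceptually: each input gets its own mapper, everything is grouped by key, and a single reducer sees the values from both. This is a simulation only and uses no Hadoop classes. The names TwoMappersOneReducer, Tagged, LineMapper, MAPPER_1A and MAPPER_1B, as well as the input line formats, are made up for illustration. In a real job the per-input wiring would be done with MultipleInputs.addInputPath (one call per input path, each naming its own mapper class), and the tagged value would have to be a Writable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiConsumer;

/** Plain-Java simulation (no Hadoop classes): two mappers feeding one reducer. */
public class TwoMappersOneReducer {

    /** Hypothetical container: one value type shared by both mappers, tagged by source. */
    public static final class Tagged {
        public final String source; // "1a" or "1b"
        public final String value;

        public Tagged(String source, String value) {
            this.source = source;
            this.value = value;
        }
    }

    /** Hypothetical mapper contract: turn one input line into (key, tagged value) pairs. */
    public interface LineMapper {
        void map(String line, BiConsumer<Integer, Tagged> out);
    }

    // Stand-in for mapper 1a: assumes input lines of the form "id,name".
    static final LineMapper MAPPER_1A = (line, out) -> {
        String[] f = line.split(",", 2);
        out.accept(Integer.parseInt(f[0].trim()), new Tagged("1a", f[1].trim()));
    };

    // Stand-in for mapper 1b: assumes input lines of the form "city<TAB>id".
    static final LineMapper MAPPER_1B = (line, out) -> {
        String[] f = line.split("\t", 2);
        out.accept(Integer.parseInt(f[1].trim()), new Tagged("1b", f[0].trim()));
    };

    /** Runs each mapper on its own input, groups by key, then reduces once per key. */
    public static Map<Integer, String> run(List<String> inputA, List<String> inputB) {
        // The "shuffle": both mappers emit the same key and value types.
        Map<Integer, List<Tagged>> shuffled = new TreeMap<>();
        BiConsumer<Integer, Tagged> collect =
            (k, v) -> shuffled.computeIfAbsent(k, x -> new ArrayList<>()).add(v);

        for (String line : inputA) {
            MAPPER_1A.map(line, collect);
        }
        for (String line : inputB) {
            MAPPER_1B.map(line, collect);
        }

        Map<Integer, String> result = new TreeMap<>();
        shuffled.forEach((k, values) -> result.put(k, reduce(values)));
        return result;
    }

    /** Single reducer: uses the tag to tell which mapper produced each value. */
    static String reduce(List<Tagged> values) {
        StringBuilder fromA = new StringBuilder();
        StringBuilder fromB = new StringBuilder();
        for (Tagged t : values) {
            StringBuilder target = t.source.equals("1a") ? fromA : fromB;
            if (target.length() > 0) {
                target.append('|');
            }
            target.append(t.value);
        }
        return fromA + ";" + fromB;
    }
}
```

Calling run with the lines "1,alice" and "2,bob" for the first input and "Paris<TAB>1" and "Oslo<TAB>1" for the second gives "alice;Paris|Oslo" for key 1 and "bob;" for key 2. The tag is what lets one reducer, with one value type, treat the two sources differently.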

Regards

Bertrand


On Sun, Sep 23, 2012 at 7:37 AM, Harsh J <harsh@cloudera.com> wrote:

> Hi,
>
> There's an easier way to do what Bertrand has suggested. Look at
> MultipleInputs class:
> http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/lib/input/MultipleInputs.html
> and see this blog post for an example of how to use it:
> http://kickstarthadoop.blogspot.in/2011/09/joins-with-plain-map-reduce.html
>
> Note though that the reducer takes a single input key type and a single
> input value type, so both mappers must emit the same types. There's no
> easy way around that aside from using generic containers.
>
>
> On Sun, Sep 23, 2012 at 9:34 AM, kumudu harshani <kumuduharshani@gmail.com>
> wrote:
>
>> I am sorry, I didn't get you. Shouldn't I handle that in the JobConf code?
>>
>> The confusion I have is, if I write something like:
>>
>> JobConf conf2 = new JobConf(WordCount.class);
>> Job job2 = new Job(conf2);
>> conf2.setOutputKeyClass(IntWritable.class);
>> conf2.setOutputValueClass(Text.class);
>>
>> conf2.setMapperClass(Map1.class);
>> conf2.setReducerClass(Reduce1.class);
>>
>> --- it will execute Map1.class and then Reduce1.class.
>>
>> So if I have Mapper1a.class and Mapper1b.class, how should I write the
>> job code to execute both and then execute Reducer.class, such that the
>> Reducer takes the outputs emitted by both mappers (1a, 1b)?
>>
>> thanks
>> kumudu
>>
>> On Sun, Sep 23, 2012 at 9:23 AM, Bertrand Dechoux <dechouxb@gmail.com> wrote:
>>
>>> You can use the map.input.file property to decide which logic your
>>> mapper should apply.
>>> Regards
>>> Bertrand
>>>
>>>
>>> On Sun, Sep 23, 2012 at 5:40 AM, kumudu harshani <
>>> kumuduharshani@gmail.com> wrote:
>>>
>>>> Hi...
>>>> Could someone help me with the following scenario?
>>>>
>>>> I want to implement a job which takes the outputs of 2 mappers and sends
>>>> them to 1 reducer. The attached image shows the flow I want.
>>>>
>>>>
>>>>
>>>>
>>>> Normal flow is like:
>>>>
>>>> JobConf conf2 = new JobConf(WordCount.class);
>>>> Job job2 = new Job(conf2);
>>>> conf2.setOutputKeyClass(IntWritable.class);
>>>> conf2.setOutputValueClass(Text.class);
>>>>
>>>> conf2.setMapperClass(Map1.class);
>>>> conf2.setReducerClass(Reduce1.class);
>>>>
>>>> --- where it takes 1 mapper and 1 reducer. What I want is to set 2
>>>> mappers (mapper1a, mapper1b) and 1 reducer.
>>>> Is that possible? If so, could someone please help?
>>>>
>>>> thanks
>>>> kumudu
>>>> --
>>>>
>>>> *Kumudu Samarappuli* | Creative Search Technologies, a Microsoft IEG
>>>> Partner | Software Engineer I | m: +94 719 258 242 |
>>>> www.microsoft.com/enterprisesearch
>>>>
>>>>
>>>
>>>
>>> --
>>> Bertrand Dechoux
>>>
>>
>>
>>
>> --
>>
>> *Kumudu Samarappuli* | Creative Search Technologies, a Microsoft IEG
>> Partner | Software Engineer I | m: +94 719 258 242 |
>> www.microsoft.com/enterprisesearch
>>
>>
>
>
> --
> Harsh J
>



-- 
Bertrand Dechoux
