hadoop-common-user mailing list archives

From Shahab Yunus <shahab.yu...@gmail.com>
Subject Re: Reducer called twice for same key
Date Mon, 29 Jun 2015 13:47:59 GMT
Ravikant,

How does the output that you sent in the email map to the one you are
printing in the code (using SOP statements)?

Where do you see the reducer being called again for the same key? Maybe I am
missing something, but the output statements in the code look different.
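
For reference, here is a minimal sketch (plain Java, no Hadoop dependencies) of
the merge I would expect a single reduce() call to perform when both values for
a key arrive together. It assumes the value formats quoted below (-1#P from one
mapper, E#-1 from the other). The class and method names are hypothetical and
are not taken from your attached code.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the body of a reduce() call; not the attached PALReducer.
public class JoinSketch {
    // Placeholder the mappers use for the field they do not know.
    static final String MISSING = "-1";

    // Combines values such as "-1#P" and "E#-1" into "E#P".
    public static String merge(Iterable<String> values) {
        String e = MISSING;
        String p = MISSING;
        for (String v : values) {
            // Split on the last '#', so E itself may contain other separators.
            int sep = v.lastIndexOf('#');
            if (sep < 0) {
                continue; // skip values that do not follow the assumed format
            }
            String left = v.substring(0, sep);
            String right = v.substring(sep + 1);
            if (!left.equals(MISSING)) {
                e = left;
            }
            if (!right.equals(MISSING)) {
                p = right;
            }
        }
        return e + "#" + p;
    }

    public static void main(String[] args) {
        // Sample values shaped like the ones reported for key 391.
        List<String> valuesForKey391 = Arrays.asList("-1#11", "5352454#-1");
        System.out.println("391\t" + merge(valuesForKey391));
    }
}
```

If the reducer really received the two values in separate calls, a merge like
this would emit two partial records (E#-1 and -1#P) for the key instead of one
combined record, which would be an easy thing to check in the job output.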

Regards,
Shahab

On Mon, Jun 29, 2015 at 2:10 AM, Ravikant Dindokar <ravikant.iisc@gmail.com>
wrote:

> Hi Harshit,
>
> PFA
>
> Thanks
> Ravikant
>
> On Mon, Jun 29, 2015 at 11:31 AM, Harshit Mathur <mathursharp@gmail.com>
> wrote:
>
>> Can you share PALReducer also?
>>
>> On Mon, Jun 29, 2015 at 11:21 AM, Ravikant Dindokar <
>> ravikant.iisc@gmail.com> wrote:
>>
>>> Adding source code for more clarity
>>>
>>> Problem statement is simple
>>>
>>> PartitionFileMapper : it takes input file which has tab separated value
>>> V , P
>>> It emits (V, -1#P)
>>>
>>> ALFileMapper : It takes input file which has tab separated values V, EL
>>> It emits (V, E#-1)
>>>
>>> in reducer I want to emit
>>> (V,E#P)
>>>
>>> Thanks
>>> Ravikant
>>>
>>> On Mon, Jun 29, 2015 at 11:04 AM, Ravikant Dindokar <
>>> ravikant.iisc@gmail.com> wrote:
>>>
>>>> By custom key, did you mean some class object? Then no.
>>>>
>>>> I have two map methods, each having a different file as input. Both
>>>> map methods emit a *LongWritable* key type. But in the stdout of the
>>>> container file I can see,
>>>>
>>>> key & value separated by ':'
>>>>
>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:*391*:-1#11
>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:*391*:3278620528725786624:5352454#-1
>>>>
>>>> For key 391 the reducer is called twice: once for the value from the
>>>> first map and once for the value from the other map.
>>>>
>>>> In map method I parse the string from input file as Long variable and
>>>> then emit it as LongWritable.
>>>>
>>>> Is there something I am missing when I use MultipleInputs
>>>> (org.apache.hadoop.mapreduce.lib.input.MultipleInputs)?
>>>>
>>>> Thanks
>>>> Ravikant
>>>>
>>>> On Mon, Jun 29, 2015 at 9:22 AM, Harshit Mathur <mathursharp@gmail.com>
>>>> wrote:
>>>>
>>>>> As per Map Reduce, it is not possible that two different reducers will
>>>>> get same keys.
>>>>> I think you have created some custom key type? If that is the case
>>>>> then there should be some issue with the comparator.
>>>>>
>>>>> On Mon, Jun 29, 2015 at 12:40 AM, Ravikant Dindokar <
>>>>> ravikant.iisc@gmail.com> wrote:
>>>>>
>>>>>> Hi Hadoop user,
>>>>>>
>>>>>> I have two map classes processing two different input files. Both map
>>>>>> functions have the same key,value format to emit.
>>>>>>
>>>>>> But the reducer is called twice for the same key, once for the value
>>>>>> from the first map and once for the value from the other map.
>>>>>>
>>>>>> I am printing (key ,value) pairs in reducer  :
>>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:391:-1#11
>>>>>>
>>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:391:3278620528725786624:5352454#-1
>>>>>>
>>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:591:3278620528725852160:4194699#-1
>>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:591:-1#13
>>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:2391:-1#19
>>>>>>
>>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:2391:3278620528725917696:5283986#-1
>>>>>>
>>>>>> ./container_1435326857837_0036_01_000102/stdout:Reduce:3291:3278620528725983232:4973087#-1
>>>>>>
>>>>>> Both maps emit a LongWritable key and a Text value.
>>>>>>
>>>>>>
>>>>>> Any idea why this is happening?
>>>>>> Is there any way to get hash values generated by hadoop for keys
>>>>>> emitted by mapper?
>>>>>>
>>>>>> Thanks
>>>>>> Ravikant
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Harshit Mathur
>>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> Harshit Mathur
>>
>
>
