hadoop-common-user mailing list archives

From Amit Sela <am...@infolinks.com>
Subject Re: manipulating key in combine phase
Date Mon, 13 Jan 2014 17:39:25 GMT
More than a solution, I'd like to know whether a combiner is allowed to change
the key. Will it interfere with the mapper's sort/merge?


On Mon, Jan 13, 2014 at 3:06 PM, Devin Suiter RDX <dsuiter@rdx.com> wrote:

> Amit,
>
> Have you explored the ChainMapper class?
>
> *Devin Suiter*
> Jr. Data Solutions Software Engineer
> 100 Sandusky Street | 2nd Floor | Pittsburgh, PA 15212
> Google Voice: 412-256-8556 | www.rdx.com
>
>
> On Sun, Jan 12, 2014 at 7:28 PM, John Lilley <john.lilley@redpoint.net> wrote:
>
>> Isn’t this what you’d normally do in the Mapper?
>>
>> My understanding of the combiner is that it is like a “mapper-side
>> pre-reducer” and operates on blocks of data that have already been sorted
>> by key, so mucking with the keys doesn’t **seem** like a good idea.
>>
>> john
>>
>>
>>
>> *From:* Amit Sela [mailto:amits@infolinks.com]
>> *Sent:* Sunday, January 12, 2014 9:26 AM
>> *To:* user@hadoop.apache.org
>> *Subject:* manipulating key in combine phase
>>
>>
>>
>> Hi all,
>>
>>
>>
>> I was wondering if it is possible to manipulate the key during combine:
>>
>>
>>
>> Say I have a mapreduce job where the key has many qualifiers.
>>
>> I would like to "split" the key into two (or more) keys if it has more
>> than, say, 100 qualifiers.
>>
>> In the combiner class I would do something like:
>>
>>
>>
>> int count = 0;
>> for (Writable value : values) {
>>   if (++count >= 100) {
>>     context.write(newKey, value);
>>   } else {
>>     context.write(key, value);
>>   }
>> }
>>
>>
>>
>> where newKey is something like key+randomUUID
>>
>>
>>
>> I know that the combiner can be called "zero, once or more..." and I'm
>> getting strange results (same key written more than once), so I would be
>> glad to get some deeper insight into how the combiner works.
>>
>>
>>
>> Thanks,
>>
>>
>>
>> Amit.
>>
>
>
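
For reference, John's suggestion of doing the split in the Mapper could look roughly like the sketch below. It is plain Java with no Hadoop dependencies, and KeySalter, salt() and originalKey() are hypothetical names invented for illustration, not Hadoop APIs. It also swaps the random UUID suffix for a deterministic bucket number, on the assumption that a predictable suffix is acceptable for the job. The idea is that the key is already in its final form before the map-side sort, so the combiner never has to touch it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not a Hadoop class): splits a "hot" key into
// several keys at map time by appending a bucket suffix.
class KeySalter {
    private final int maxPerKey;
    // Number of records seen so far for each original key.
    private final Map<String, Integer> seen = new HashMap<>();

    KeySalter(int maxPerKey) {
        if (maxPerKey <= 0) {
            throw new IllegalArgumentException("maxPerKey must be positive");
        }
        this.maxPerKey = maxPerKey;
    }

    // Returns key + "#" + bucket; the bucket number advances after every
    // maxPerKey records seen for that key by this instance.
    String salt(String key) {
        int count = seen.merge(key, 1, Integer::sum); // includes this record
        int bucket = (count - 1) / maxPerKey;
        return key + "#" + bucket;
    }

    // Recovers the original key, e.g. for a follow-up aggregation step.
    static String originalKey(String salted) {
        return salted.substring(0, salted.lastIndexOf('#'));
    }

    public static void main(String[] args) {
        KeySalter salter = new KeySalter(100);
        List<String> emitted = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            // In a real Mapper this would be the key passed to context.write().
            emitted.add(salter.salt("row1"));
        }
        // Print the 1st, 101st and 250th emitted keys.
        System.out.println(emitted.get(0) + " " + emitted.get(100) + " " + emitted.get(249));
    }
}
```

Note that in a real job the counter would be local to each map task, so the same salted key can still arrive from several tasks and meet at the reducer. The sketch only bounds how many values one task attaches to one key.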
