hadoop-hdfs-user mailing list archives

From "bit1129@163.com" <bit1...@163.com>
Subject Re: RE: Question about shuffle/merge/sort phrase
Date Mon, 22 Dec 2014 05:20:05 GMT
Thanks Rohith,

My question is about the reducer side; it is not related to the Combiner (which runs
on the mapper node).

When all the mappers' output key/value pairs have been shuffled to the reducer nodes, three things
should be done:
1. Merge the mappers' output key/value pairs from all the mapper nodes together.
2. Sort the key/value pairs by key.
3. Collect all the values of the same key into an iterable collection, in a format like <key,
My question is: who takes the responsibility for forming this iterable collection?
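
As far as I understand it, this grouping is done by the MapReduce framework itself inside the reduce task, not by user code: because the merged input is already sorted by key, the framework only has to compare adjacent keys (using the job's grouping comparator) and hand each run of equal keys to Reducer.reduce() as one key plus an Iterable of values. The following is a minimal sketch of that idea in plain Java, not Hadoop's actual code; the class GroupSortedPairsSketch and the method group() are hypothetical names, and unlike Hadoop it materializes each group in a list instead of streaming it.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class GroupSortedPairsSketch {
    // Hypothetical helper, not a Hadoop API: walks pairs that are already
    // sorted by key and starts a new group whenever the key changes.
    static List<Map.Entry<String, List<Integer>>> group(
            List<Map.Entry<String, Integer>> sortedPairs) {
        List<Map.Entry<String, List<Integer>>> groups = new ArrayList<>();
        String currentKey = null;
        List<Integer> currentValues = null;
        for (Map.Entry<String, Integer> pair : sortedPairs) {
            // A key different from the previous one closes the old group.
            if (currentValues == null || !pair.getKey().equals(currentKey)) {
                currentKey = pair.getKey();
                currentValues = new ArrayList<>();
                groups.add(new SimpleEntry<>(currentKey, currentValues));
            }
            currentValues.add(pair.getValue());
        }
        return groups;
    }

    public static void main(String[] args) {
        // Input stands in for the merged, key-sorted output of all mappers.
        List<Map.Entry<String, Integer>> sorted = new ArrayList<>();
        sorted.add(new SimpleEntry<>("a", 1));
        sorted.add(new SimpleEntry<>("a", 3));
        sorted.add(new SimpleEntry<>("b", 2));
        for (Map.Entry<String, List<Integer>> g : group(sorted)) {
            System.out.println(g.getKey() + " -> " + g.getValue());
        }
    }
}
```

The sketch only works because the input is sorted: equal keys are guaranteed to be adjacent, so a single pass with one comparison per pair is enough.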


From: Rohith Sharma K S
Date: 2014-12-22 12:23
To: user@hadoop.apache.org
Subject: RE: Question about shuffle/merge/sort phrase
whose responsibility is it that brings each key with all its values together
>> You can set a combiner class in your job. For more information, refer
Thanks & Regards
Rohith Sharma K S
From: Todd [mailto:bit1129@163.com] 
Sent: 21 December 2014 19:29
To: user@hadoop.apache.org
Subject: Question about shuffle/merge/sort phrase
Hi, Hadoopers,
I have a question about the shuffle/sort/merge phase.
My understanding is that shuffle transfers the mapper output (key/value pairs) from
the mapper nodes to the reducer node, the merge phase merges the mapper output from
all mapper nodes, and the sort phase sorts the key/value pairs by key.
So my question is: whose responsibility is it to bring each key together with all its values?
(The reducer's input is a key and an iterable collection of values.)
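
One point that may help with the merge/sort distinction: as far as I know, each mapper's output is already sorted by key before it is shuffled, so on the reducer side "sort" amounts to merging several sorted runs rather than sorting from scratch. Below is a minimal sketch of such a k-way merge in plain Java, assuming string keys and in-memory lists; MergeSortedRunsSketch and merge() are hypothetical names, and Hadoop's real merger works on serialized segments on disk and in memory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

public class MergeSortedRunsSketch {
    // Hypothetical helper: merges several individually sorted runs of keys
    // (one run per mapper) into a single sorted list.
    static List<String> merge(List<List<String>> runs) {
        // Each queue entry is {runIndex, positionInRun}; entries are ordered
        // by the key found at that position.
        PriorityQueue<int[]> heads = new PriorityQueue<>(
            (x, y) -> runs.get(x[0]).get(x[1]).compareTo(runs.get(y[0]).get(y[1])));
        for (int i = 0; i < runs.size(); i++) {
            if (!runs.get(i).isEmpty()) {
                heads.add(new int[] {i, 0});
            }
        }
        List<String> merged = new ArrayList<>();
        while (!heads.isEmpty()) {
            int[] head = heads.poll();          // smallest remaining key
            List<String> run = runs.get(head[0]);
            merged.add(run.get(head[1]));
            if (head[1] + 1 < run.size()) {     // advance within the same run
                heads.add(new int[] {head[0], head[1] + 1});
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<String>> runs = Arrays.asList(
            Arrays.asList("a", "c"),    // sorted output of mapper 1
            Arrays.asList("b", "c"),    // sorted output of mapper 2
            Arrays.asList("a", "d"));   // sorted output of mapper 3
        System.out.println(merge(runs)); // prints [a, a, b, c, c, d]
    }
}
```

After this merge, equal keys from different mappers sit next to each other, which is what makes it possible to present each key with all of its values to the reducer.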
