hadoop-common-user mailing list archives

From Brad Tofel <b...@archive.org>
Subject Re: Reduce function
Date Tue, 19 Oct 2010 00:41:01 GMT
The "Partitioner" implementation used with your job determines which 
reduce target receives a given map output key.

I don't know whether an existing Partitioner implementation meets your 
needs, but the interface is not very complex, so you can write your own 
if nothing existing works for you.
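
As a rough illustration of the idea (a sketch, not code from any shipped 
Hadoop class): a partitioner for tagged keys can hash only the untagged 
part of the key, so records that differ only in their tag get the same 
reduce target. The example below is plain Java with no Hadoop dependency. 
TaggedKey and JoinKeyPartitioner are hypothetical names made up for this 
sketch; in a real job the same logic would go in the 
getPartition(key, value, numPartitions) method of your Partitioner.

```java
import java.util.Objects;

/** Hypothetical composite key: a join key plus a tag marking the source dataset. */
final class TaggedKey {
    final String joinKey;
    final int tag;

    TaggedKey(String joinKey, int tag) {
        this.joinKey = Objects.requireNonNull(joinKey);
        this.tag = tag;
    }
}

/** Plain-Java stand-in for a Partitioner that ignores the tag (assumed design). */
final class JoinKeyPartitioner {
    /** Returns a reduce index in [0, numPartitions) derived from joinKey only. */
    static int getPartition(TaggedKey key, int numPartitions) {
        // Mask the sign bit so a negative hashCode cannot produce a negative index.
        return (key.joinKey.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}
```

Because the tag never enters the hash, two keys with the same joinKey and 
different tags always map to the same reduce target. Masking with 
Integer.MAX_VALUE is the same trick Hadoop's default HashPartitioner uses 
on the full key's hashCode.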


On 10/18/2010 04:43 PM, Shi Yu wrote:
> How many tags do you have? If you have several tags, you'd 
> better create a Vector class to hold them, and define a sum 
> function to increment the tag values. The value class should then 
> be your new Vector class. That is cleaner than the 
> TextPair approach.
> Shi
> On 2010-10-18 5:19, Matthew John wrote:
>> Hi all,
>> I had a small doubt regarding the reduce module. What I understand is that
>> after the shuffle / sort phase, all the records with the same key value
>> go into a reduce function. If that's the case, what is the attribute of the
>> Writable key which ensures that all those keys go to the same reduce?
>> I am working on a reduce-side join where I need to tag all the keys with a
>> bit which might vary, but I still want all those records to go into the same
>> reduce. In Hadoop: The Definitive Guide, pg. 235, they are using TextPair for
>> the key. But I don't understand how the keys with different tag information
>> go into the same reduce.
>> Matthew
