hadoop-common-user mailing list archives

From Alberto Cordioli <cordioli.albe...@gmail.com>
Subject Re: Find reducer for a key
Date Thu, 28 Mar 2013 11:50:38 GMT
Hi Hemanth,

thanks for your reply.
Yes, this partially answered my question. I know how the hash
partitioner works and I guessed something similar.
The piece that I missed was that mapred.task.partition returns the
partition number of the reducer.
So, putting all the pieces together, I understand that for each key in
the file I have to call the HashPartitioner.
Then I have to compare the returned index with the one retrieved from
mapred.task.partition.
If they are equal, that key will be served by this reducer. Is this correct?
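
A minimal sketch of the check described above, in plain Java with no
Hadoop dependency. The class and method names are hypothetical; the
arithmetic is the HashPartitioner code quoted below. It assumes the
default HashPartitioner is in use, and it uses String keys only for
illustration: in a real job the key must be the same Writable type the
mappers emit (for example Text, whose hashCode() differs from
String.hashCode()), and the number of reducers and the partition number
would come from the job configuration rather than being hard-coded.

```java
// Sketch only: mirrors the HashPartitioner arithmetic without using Hadoop.
final class ReducerKeyCheck {

    // Same formula as HashPartitioner.getPartition: clear the sign bit of
    // the hash, then take it modulo the number of reduce tasks.
    static int partitionFor(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // True if the key would be routed to the reducer whose partition
    // number (the value of mapred.task.partition) is myPartition.
    static boolean servedByThisReducer(Object key, int numReduceTasks, int myPartition) {
        return partitionFor(key, numReduceTasks) == myPartition;
    }

    public static void main(String[] args) {
        int numReduceTasks = 4; // hypothetical value for the demo
        int myPartition = 2;    // hypothetical value for the demo
        for (String key : new String[] {"apple", "banana", "cherry"}) {
            System.out.println(key + " -> partition "
                    + partitionFor(key, numReduceTasks)
                    + ", served here: "
                    + servedByThisReducer(key, numReduceTasks, myPartition));
        }
    }
}
```

If the job plugs in a custom partitioner, the same comparison applies,
but with that partitioner's getPartition instead of the formula above.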

To answer your question:
On the reduce side of an MR job, I want to load some data from a file
into an in-memory structure. Actually, I don't need to store the whole
file in each reducer, but only the lines related to the keys that a
particular reducer will receive.
So my intention is to determine those keys in the setup method and
store only the needed lines.
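
Roughly, the filtering I have in mind for the setup method would look
like the sketch below. It is plain Java, not Hadoop code: the names are
hypothetical, each whole line is assumed to be a key, String.hashCode()
stands in for the hash of the real key type, and the default
HashPartitioner is assumed.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch only: keeps the lines whose key hashes to the current partition.
final class ReducerSideFilter {

    // Same formula as the default HashPartitioner.
    static int partitionFor(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Returns only those lines that this reducer (partition myPartition)
    // would receive as keys; all other lines are skipped, not stored.
    static Set<String> loadKeysForPartition(List<String> lines,
                                            int numReduceTasks,
                                            int myPartition) {
        Set<String> kept = new HashSet<>();
        for (String line : lines) {
            if (partitionFor(line, numReduceTasks) == myPartition) {
                kept.add(line);
            }
        }
        return kept;
    }
}
```

With n reducers, each one would then hold roughly 1/n of the file,
depending on how evenly the keys hash.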


On 28 March 2013 11:01, Hemanth Yamijala <yhemanth@thoughtworks.com> wrote:
> Hi,
> Not sure if I am answering your question, but this is the background. Every
> MapReduce job has a partitioner associated with it. The default partitioner is
> a HashPartitioner. As a user, you can also write your own partitioner and
> plug it into the job. The partitioner is responsible for splitting the map
> output key space among the reducers.
> So, the reducer a key will go to is basically the value
> returned by the partitioner's getPartition method. For example, this is the code
> in the HashPartitioner:
>   public int getPartition(K2 key, V2 value,
>                           int numReduceTasks) {
>     return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
>   }
> mapred.task.partition is the key that defines the partition number of this
> reducer.
> I guess you can piece these bits together into what you want. However, I
> am interested in understanding why you want to know this. Can you share
> some info?
> Thanks
> Hemanth
> On Thu, Mar 28, 2013 at 2:17 PM, Alberto Cordioli
> <cordioli.alberto@gmail.com> wrote:
>> Hi everyone,
>> how can I know the keys that are associated with a particular reducer in
>> the setup method?
>> Let's assume that the setup method reads from a file where each line
>> is a string that will become a key emitted from mappers.
>> For each of these lines I would like to know if the string will be a
>> key associated with the current reducer or not.
>> I read something about mapred.task.partition and mapred.task.id, but I
>> didn't understand the usage.
>> Thanks,
>> Alberto
>> --
>> Alberto Cordioli

Alberto Cordioli
