hadoop-user mailing list archives

From Hemanth Yamijala <yhema...@thoughtworks.com>
Subject Re: Find reducer for a key
Date Thu, 28 Mar 2013 10:01:29 GMT

Not sure if I am answering your question, but here is the background. Every
MapReduce job has a partitioner associated with it. The default partitioner
is HashPartitioner. As a user, you can also write your own partitioner
and plug it into the job. The partitioner is responsible for splitting the
map output key space among the reducers.

So the reducer a key will go to is simply the value returned by the
partitioner's getPartition method. For example, this is the code in
HashPartitioner:

  public int getPartition(K2 key, V2 value,
                          int numReduceTasks) {
    return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
  }

mapred.task.partition is the configuration key that defines the partition
number of the current task.

I guess you can piece these bits together into what you want. However, I
am interested in understanding why you want to know this. Can you share
some info?
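
As a rough sketch of how these pieces might fit together, the snippet below
repeats the HashPartitioner arithmetic in plain Java and filters a list of
candidate keys down to those that would land on a given partition. The class
and method names (PartitionSketch, partitionFor, keysForPartition) are
hypothetical and only for illustration, and it assumes the job uses the
default HashPartitioner. It uses String keys to stay self-contained; in a real
job the hash must be computed on the same key type the mappers emit (for
example Text, whose hashCode differs from String's).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hadoop: mirrors the default
// HashPartitioner arithmetic so it can be tried without a cluster.
class PartitionSketch {

    // Same formula as HashPartitioner.getPartition: clear the sign bit of
    // the hash, then take the remainder modulo the number of reducers.
    public static int partitionFor(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Keep only the candidate keys that would be routed to myPartition.
    public static <K> List<K> keysForPartition(List<K> candidates,
                                               int myPartition,
                                               int numReduceTasks) {
        List<K> mine = new ArrayList<>();
        for (K key : candidates) {
            if (partitionFor(key, numReduceTasks) == myPartition) {
                mine.add(key);
            }
        }
        return mine;
    }

    public static void main(String[] args) {
        // Stand-in for the lines read from the file in setup().
        List<String> candidates = Arrays.asList("a", "b", "c", "d");
        int numReduceTasks = 3;
        for (int p = 0; p < numReduceTasks; p++) {
            System.out.println("partition " + p + " -> "
                    + keysForPartition(candidates, p, numReduceTasks));
        }
    }
}
```

Inside an actual reducer's setup method, myPartition would presumably come
from context.getConfiguration().getInt("mapred.task.partition", -1) and
numReduceTasks from context.getNumReduceTasks(). With a custom partitioner,
the call to partitionFor would have to be replaced by that partitioner's own
getPartition.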


On Thu, Mar 28, 2013 at 2:17 PM, Alberto Cordioli <
cordioli.alberto@gmail.com> wrote:

> Hi everyone,
> how can i know the keys that are associated to a particular reducer in
> the setup method?
> Let's assume in the setup method to read from a file where each line
> is a string that will become a key emitted from mappers.
> For each of these lines I would like to know if the string will be a
> key associated with the current reducer or not.
> I read something about mapred.task.partition and mapred.task.id, but I
> didn't understand the usage.
> Thanks,
> Alberto
> --
> Alberto Cordioli
