Date: Thu, 28 Mar 2013 15:31:29 +0530
Subject: Re: Find reducer for a key
From: Hemanth Yamijala
To: "user@hadoop.apache.org"

Hi,

Not sure if I am answering your question, but here is the background. Every MapReduce job has a partitioner associated with it. The default partitioner is HashPartitioner. As a user, you can also write your own partitioner and plug it into the job. The partitioner is responsible for splitting the map output key space among the reducers.

So the reducer a key will go to is simply the value returned by the partitioner's getPartition method. For example, this is the code in HashPartitioner:

  public int getPartition(K2 key, V2 value,
                          int numReduceTasks) {
    return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
  }

mapred.task.partition is the configuration key that holds the partition number of the current reducer.

I guess you can piece these bits together to get what you want.
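As a rough illustration of how those two pieces might fit together, here is a minimal standalone sketch with no Hadoop dependency. The class name PartitionLookup and both method names are hypothetical and exist only for this example. It repeats the HashPartitioner arithmetic quoted above and filters a list of candidate keys down to those that map to a given partition. It assumes the job uses the default HashPartitioner. In a real reducer's setup method, the partition number would presumably be read from the job configuration under mapred.task.partition, and the reducer count from the task context, rather than passed in by hand.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: mirrors the HashPartitioner formula
// so a reducer could test which candidate keys belong to its own partition.
final class PartitionLookup {
    private PartitionLookup() {}

    // Same arithmetic as the HashPartitioner code quoted in the message:
    // clear the sign bit so the result is non-negative, then take the remainder.
    static int partitionFor(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Returns the candidate keys whose partition equals myPartition.
    // myPartition stands in for the value stored under mapred.task.partition.
    static List<String> keysForPartition(List<String> candidateKeys,
                                         int myPartition,
                                         int numReduceTasks) {
        List<String> mine = new ArrayList<>();
        for (String key : candidateKeys) {
            if (partitionFor(key, numReduceTasks) == myPartition) {
                mine.add(key);
            }
        }
        return mine;
    }
}
```

Note that this sketch hashes plain Java strings. In an actual job the hash must be computed on the same key type the mappers emit (for instance a Text built from each line), because a Writable key's hashCode() is not guaranteed to match String.hashCode(). If the job plugs in a custom partitioner, that partitioner's getPartition method should be called instead of the formula here.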
However, I am interested in understanding why you want to know this. Can you share some more information?

Thanks
Hemanth

On Thu, Mar 28, 2013 at 2:17 PM, Alberto Cordioli <cordioli.alberto@gmail.com> wrote:

> Hi everyone,
>
> How can I know which keys are associated with a particular reducer in
> the setup method?
> Let's assume that, in the setup method, I read from a file where each line
> is a string that will become a key emitted by the mappers.
> For each of these lines I would like to know whether the string will be a
> key associated with the current reducer or not.
>
> I read something about mapred.task.partition and mapred.task.id, but I
> didn't understand how to use them.
>
>
> Thanks,
> Alberto
>
>
> --
> Alberto Cordioli