From: Artem Onuchin
To: user@hadoop.apache.org
Date: Wed, 13 Mar 2013 13:12:19 +0400
Subject: Re: How to shuffle (Key,Value) pair from mapper to multiple reducer

Hello Vikas!

Well, you can duplicate your pair in the mapper, once for each reducer, add a reducer tag to each key, and write a partitioner that routes records according to those tags.

I mean something like this:
In the mapper you emit (key_r1, value) and (key_r2, value) instead of (key, value).
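
To make that concrete, here is a minimal plain-Java sketch of the idea. It does not use the Hadoop API at all, and every name in it (ReducerFanOut, TAG_SEPARATOR, fanOutKeys, originalKey) is made up for illustration. Tags are numbered from 0 to line up with reducer0 and reducer1, and it assumes string keys. In a real job the getPartition logic would presumably live in a class extending org.apache.hadoop.mapreduce.Partitioner, registered with Job#setPartitionerClass.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch of "tag the key, partition on the tag".
// Hypothetical names throughout; nothing here depends on Hadoop.
final class ReducerFanOut {
    // Separator between the original key and the reducer tag (an assumption).
    static final String TAG_SEPARATOR = "_r";

    // Mapper side: produce one tagged copy of the key per reducer.
    static List<String> fanOutKeys(String key, int numReducers) {
        List<String> tagged = new ArrayList<>();
        for (int r = 0; r < numReducers; r++) {
            tagged.add(key + TAG_SEPARATOR + r);
        }
        return tagged;
    }

    // Partitioner side: route on the trailing tag only, ignoring the key itself.
    static int getPartition(String taggedKey, int numPartitions) {
        int idx = taggedKey.lastIndexOf(TAG_SEPARATOR);
        int tag = Integer.parseInt(taggedKey.substring(idx + TAG_SEPARATOR.length()));
        return tag % numPartitions;
    }

    // Reducer side: strip the tag to recover the original key.
    static String originalKey(String taggedKey) {
        return taggedKey.substring(0, taggedKey.lastIndexOf(TAG_SEPARATOR));
    }
}
```

With key "1" and two reducers, the mapper side yields "1_r0" and "1_r1", which this sketch sends to partitions 0 and 1 respectively. Note that the map output grows by a factor equal to the number of reducers.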

But I cannot imagine why you need that.

WBR, Onuchin Artem



2013/3/13 Viral Bajaria <viral.bajaria@gmail.com>

> Do you want the pair to go to both reducers, or do you want it to go to
> only one, but in a random fashion?
>
> AFAIK, the 1st is not possible. Someone on the list can correct me if I am wrong.
> The 2nd is possible by just implementing your own partitioner which randomizes
> where each key goes (not sure what you gain by that).


> On Wed, Mar 13, 2013 at 1:59 AM, Vikas Jadhav <vikascjadhav87@gmail.com> wrote:
>
>> Hi,
>> I am stating the requirement again, with an example.
>>
>> I have a use case where I need to shuffle the same (key,value) pair to
>> multiple reducers.
>>
>> For example, we have the pair (1,"ABC") and two reducers (reducer0 and
>> reducer1). By default this pair will go to reducer1, because
>> (key % numOfReducer) = (1 % 2).
>>
>> How should I shuffle this pair to both reducers?
>>
>> I am also willing to change the code of the Hadoop framework if necessary.
>>
>> Thank you

>> On Wed, Mar 13, 2013 at 12:51 PM, feng lu <amuseme.lu@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> You can use the Job#setNumReduceTasks(int tasks) method to set the number of
>>> reduce tasks for the job.


>>> On Wed, Mar 13, 2013 at 2:15 PM, Vikas Jadhav <vikascjadhav87@gmail.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> By default, the Hadoop framework can shuffle a (key,value) pair to only one
>>>> reducer.
>>>>
>>>> I have a use case where I need to shuffle the same (key,value) pair to
>>>> multiple reducers.
>>>>
>>>> I am also willing to change the code of the Hadoop framework if necessary.
>>>>
>>>> Thank you

>>>> --
>>>> Thanx and Regards
>>>> Vikas Jadhav
>>>
>>> --
>>> Don't Grow Old, Grow Up... :-)
>>
>> --
>> Thanx and Regards
>> Vikas Jadhav

