From: Susheel Kumar Gadalay
To: user@hadoop.apache.org
Date: Mon, 22 Dec 2014 11:10:38 +0530
Subject: Re: Re: Question about shuffle/merge/sort phrase

What I explained is the shuffle phase.

After the reducer pulls the data, it sorts it by key only, which brings each
key's values together, and then calls the reduce method for each key.

On 12/22/14, bit1129@163.com wrote:
> Then what exactly happens after the reducer pulls all mapper output key/value
> pairs from all the mapper nodes, before the reducer sees the
> ?
>
> bit1129@163.com
>
> From: Susheel Kumar Gadalay
> Date: 2014-12-22 13:20
> To: user
> Subject: Re: Question about shuffle/merge/sort phrase
> Sorry, typo
>
> It is the reducer which will pull the mapper output as soon as the map task
> completes.
>
> On 12/22/14, Susheel Kumar Gadalay wrote:
>> It is the mapper which will push the output to the respective reducer as
>> soon as it completes.
>>
>> The number of reducers is known from the beginning.
>> As the mapper processes its input split, it generates the output for each
>> reducer (if the mapper output key is eligible for that reducer).
>> The reducer will wait for all map tasks to complete before it starts
>> its processing.
>>
>>
>> On 12/22/14, bit1129@163.com wrote:
>>> Could someone help me with this question? Thanks.
>>>
>>> bit1129@163.com
>>>
>>> From: Todd
>>> Sent: 2014-12-21 21:59
>>> To: user@hadoop.apache.org
>>> Subject: Question about shuffle/merge/sort phrase
>>> Hi, Hadoopers,
>>> I have a question about the shuffle/sort/merge phases.
>>> My understanding is that the shuffle phase transfers the mapper
>>> output (key/value pairs) from the mapper nodes to the reducer node, the
>>> merge phase merges the mapper output from all mapper nodes, and the sort
>>> phase sorts the key/value pairs by key.
>>> My question: whose responsibility is it to bring each key together with
>>> all its values? (The reducer's input is a key and an iterable of
>>> values.)
>>>
>>>
>>> Thanks.
>>>
>>
>
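
The reduce-side steps discussed in this thread can be sketched in a few lines of Python. This is a simplified simulation, not Hadoop code: the function names (partition, run_map_task, run_reduce_task, word_count) and the CRC32-based partitioner are assumptions made for illustration. It shows each map task sorting its output per reducer, the reducer pulling its share from every map task, merging the sorted runs by key, and grouping consecutive equal keys so that the reduce function receives one key with an iterator over its values.

```python
import heapq
import zlib
from itertools import groupby
from operator import itemgetter


def partition(key, num_reducers):
    # Hypothetical partitioner: a deterministic stand-in for a hash partitioner.
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def run_map_task(records, num_reducers):
    # Map side: bucket each (key, value) pair by target reducer,
    # then sort every bucket by key only.
    buckets = [[] for _ in range(num_reducers)]
    for key, value in records:
        buckets[partition(key, num_reducers)].append((key, value))
    return [sorted(bucket, key=itemgetter(0)) for bucket in buckets]


def run_reduce_task(reducer_id, map_outputs, reduce_fn):
    # Shuffle: pull this reducer's bucket from every finished map task.
    runs = [output[reducer_id] for output in map_outputs]
    # Merge/sort: combine the already-sorted runs into one stream ordered by key.
    merged = heapq.merge(*runs, key=itemgetter(0))
    # Grouping: consecutive pairs with the same key become one reduce call.
    results = {}
    for key, group in groupby(merged, key=itemgetter(0)):
        results[key] = reduce_fn(key, (value for _, value in group))
    return results


def word_count(splits, num_reducers=2):
    # Each input split is handled by one simulated map task emitting (word, 1).
    map_outputs = [
        run_map_task([(word, 1) for word in split.split()], num_reducers)
        for split in splits
    ]
    totals = {}
    for reducer_id in range(num_reducers):
        totals.update(
            run_reduce_task(reducer_id, map_outputs, lambda key, values: sum(values))
        )
    return totals
```

In this sketch the grouping needs no separate pass: because the merged stream is ordered by key, all values for a key arrive consecutively, and groupby hands them to the reduce function as a single iterator.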