Date: Fri, 22 Aug 2008 16:16:27 -0700
From: Kevin
To: core-user@hadoop.apache.org
Subject: Re: Question about distributed sort

For the same key, reducer is called only once.

-Kevin

On Fri, Aug 22, 2008 at 4:06 PM, Alex Holmes wrote:
> If this is the case, can the same reducer be invoked multiple times
> with the same key? And if so, would this imply that the key could
> appear on multiple lines of the reducer output file?
>
> Thanks,
> Alex
>
> On Fri, Aug 22, 2008 at 7:02 PM, Kevin wrote:
>> IIRC, the same key will always be sent to the same reducer.
>> -Kevin
>>
>> On Fri, Aug 22, 2008 at 4:00 PM, Alex Holmes wrote:
>>> Hi,
>>>
>>> For a given input key, K, in a reduce task, does Hadoop guarantee that
>>> all mapper-emitted values for key K are available in the iterator? Is
>>> it possible that multiple reduce tasks can receive the same key?
>>>
>>> Or to phrase the question in another way, for a single map-reduce job,
>>> where you have multiple mapper and multiple reducer tasks, is there a
>>> possibility that the same key appears in multiple reduce output files
>>> (assuming the reducer only emits a single output K,V pair, where the
>>> output K is identical to the input K)?
>>>
>>> Any assistance would be greatly appreciated.
>>>
>>> Thanks,
>>> Alex
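
The two answers in this thread (the same key always goes to the same reducer, and reduce is called once per key) can be illustrated with a small stand-alone simulation. This is a sketch in plain Java, not Hadoop code: the class name PartitionSketch and the methods partitionFor and shuffle are hypothetical names made up for illustration. The partition formula is assumed to match what the default hash partitioner does (non-negative hash code modulo the number of reduce tasks); a job that configures a custom partitioner may route keys differently, but each key still maps to exactly one partition.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class; not part of Hadoop.
class PartitionSketch {

    // Assumed default behaviour: clear the sign bit of the hash code,
    // then take it modulo the number of reduce tasks. The result depends
    // only on the key, so every mapper sends a given key to the same reducer.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Simulates the shuffle: one sorted map per reduce task, holding
    // key -> all values emitted for that key by any mapper.
    static List<Map<String, List<Integer>>> shuffle(
            List<Map.Entry<String, Integer>> mapOutput, int numReduceTasks) {
        List<Map<String, List<Integer>>> partitions = new ArrayList<>();
        for (int i = 0; i < numReduceTasks; i++) {
            partitions.add(new TreeMap<>());
        }
        for (Map.Entry<String, Integer> kv : mapOutput) {
            int p = partitionFor(kv.getKey(), numReduceTasks);
            partitions.get(p)
                      .computeIfAbsent(kv.getKey(), k -> new ArrayList<>())
                      .add(kv.getValue());
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Pretend these pairs came from several different mappers.
        List<Map.Entry<String, Integer>> mapOutput = List.of(
                Map.entry("a", 1), Map.entry("b", 2), Map.entry("a", 3),
                Map.entry("c", 4), Map.entry("b", 5));

        List<Map<String, List<Integer>>> partitions = shuffle(mapOutput, 3);
        for (int r = 0; r < partitions.size(); r++) {
            // Each map entry stands for exactly one reduce() call on task r.
            for (Map.Entry<String, List<Integer>> e : partitions.get(r).entrySet()) {
                System.out.println("reducer " + r + ": reduce("
                        + e.getKey() + ", " + e.getValue() + ")");
            }
        }
    }
}
```

In this model each key shows up in exactly one partition with all of its values grouped together, so a reducer that emits one output pair per input key would write that key to only one output file, and only once.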