Date: Wed, 11 Aug 2010 13:51:59 +0100
Subject: Cardinality of ReduceCopiers to map outputs
From: Pedro Costa
To: mapreduce-dev@hadoop.apache.org

Hi,

1 - Can a Map Task produce more than one map output per execution?

2 - A Map Task can't be reused, right? Once a Map Task instance has produced its map output, that instance ends, right?

3 - Does a ReduceCopier fetch only one map output at a time, saving each one to a file or in memory before merging them? For example, if a ReduceCopier fetches 4 map outputs, the reduce side saves the 4 outputs separately, and only after all map outputs have been copied to the reduce side does the ReduceCopier merge them, right?

Thanks,
--
Pedro
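
To make question 3 concrete, here is a minimal sketch of the behaviour it describes: each fetched map output is kept as its own segment, and the merge is only permitted once every expected output has arrived. This is a toy model of the question, not Hadoop's actual ReduceCopier; the class name, method names, map ids, and record shape are all hypothetical.

```python
import heapq
from typing import Dict, Iterator, List, Tuple

Record = Tuple[str, int]  # hypothetical (key, value) record shape


class ToyReduceCopier:
    """Toy model of the behaviour asked about in question 3 (not Hadoop code)."""

    def __init__(self, expected_outputs: int) -> None:
        self.expected_outputs = expected_outputs
        # One separate segment per map task, keyed by a hypothetical map id.
        self.segments: Dict[str, List[Record]] = {}

    def copy(self, map_id: str, sorted_records: List[Record]) -> None:
        # Each fetched map output is stored on its own; nothing is merged here.
        # The records are assumed to arrive already sorted.
        self.segments[map_id] = list(sorted_records)

    def all_copied(self) -> bool:
        return len(self.segments) == self.expected_outputs

    def merge(self) -> Iterator[Record]:
        # In this model, merging is refused until every map output has arrived.
        if not self.all_copied():
            raise RuntimeError("not all map outputs have been copied yet")
        # k-way merge of the already-sorted segments.
        return heapq.merge(*self.segments.values())


# The 4-map-output example from the question.
copier = ToyReduceCopier(expected_outputs=4)
copier.copy("map_0", [("a", 1), ("c", 1)])
copier.copy("map_1", [("a", 2), ("b", 1)])
copier.copy("map_2", [("b", 2)])
copier.copy("map_3", [("c", 2), ("d", 1)])
merged = list(copier.merge())
```

Whether the real ReduceCopier waits for all map outputs before merging, or merges some segments earlier, is exactly what the question asks; the sketch only encodes the assumption stated in it.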