From: Jan Lukavský
Date: Thu, 23 Aug 2012 11:25:01 +0200
Message-ID: <5035F6ED.4000104@firma.seznam.cz>
Subject: Running map tasks after all reduces have finished

Hi all,

we are seeing strange behaviour of JobTracker in the following
scenario:

 - the job finishes the map phase and starts the reduce phase
 - after the shuffle phase of all reducers we lose a tasktracker that doesn't run any reducer
 - so all remaining reducers are still running in the reduce phase
 - map tasks that were running on the lost tasktracker are rescheduled
 - reduces may finish earlier than the rescheduled map tasks, so the job is blocked waiting for the maps to finish, although their output is simply discarded

Is this behaviour a bug or a feature? :) I haven't found any JIRA that describes it; if one exists, can anyone point me to it?

Thanks,
 Jan
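
For illustration only, the scenario above can be written down as a toy model. This is not JobTracker code: the `Reducer` class, the `shuffle_done` flag and the `maps_to_rerun` function are hypothetical names invented for this sketch. It only expresses the question being asked, namely whether maps from a lost tasktracker need to be rerun when no reducer still has to fetch their output.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Reducer:
    # Hypothetical flag: True once this reducer has fetched all map outputs.
    shuffle_done: bool


def lost_map_output_still_needed(reducers: List[Reducer]) -> bool:
    """Return True if at least one reducer has not finished its shuffle."""
    return any(not r.shuffle_done for r in reducers)


def maps_to_rerun(maps_on_lost_tracker: List[str],
                  reducers: List[Reducer]) -> List[str]:
    """Hypothetical policy: rerun lost maps only if a reducer still needs them."""
    if lost_map_output_still_needed(reducers):
        return list(maps_on_lost_tracker)
    return []


# The case from this mail: every reducer is past its shuffle phase
# when the tasktracker is lost, so the toy policy reruns nothing.
reducers = [Reducer(shuffle_done=True), Reducer(shuffle_done=True)]
print(maps_to_rerun(["map_0003", "map_0007"], reducers))  # prints []
```

The sketch deliberately ignores the case where a reducer fails later and has to repeat its shuffle; in that case the lost map output would be needed again, which may be one reason for rescheduling the maps regardless.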