Message-ID: <10250788.1180709896239.JavaMail.jira@brutus>
Date: Fri, 1 Jun 2007 07:58:16 -0700 (PDT)
From: "Hadoop QA (JIRA)"
To: hadoop-dev@lucene.apache.org
Reply-To: hadoop-dev@lucene.apache.org
Subject: [jira] Commented: (HADOOP-1431) Map tasks can't timeout for failing to call progress
In-Reply-To: <13907972.1180066936133.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

    [ https://issues.apache.org/jira/browse/HADOOP-1431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500733 ]

Hadoop QA commented on HADOOP-1431:
-----------------------------------

+1

http://issues.apache.org/jira/secure/attachment/12358717/HADOOP-1431_3_20070601.patch applied and successfully tested against trunk revision r543222.

Test results:   http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/226/testReport/
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/226/console

> Map tasks can't timeout for failing to call progress
> ----------------------------------------------------
>
>                 Key: HADOOP-1431
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1431
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.13.0
>            Reporter: Owen O'Malley
>            Assignee: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.13.0
>
>         Attachments: HADOOP-1431_1_20070525.patch, HADOOP-1431_2_20070530.patch, HADOOP-1431_3_20070601.patch
>
>
> Currently the map task runner creates a thread that calls progress every second to keep the system from killing the map if the sort takes too long. This is the wrong approach, because it will cause stuck tasks to not be killed. The right solution is to have the sort call progress as it actually makes progress.
> This is part of what is going on in HADOOP-1374. A map gets stuck at 100% progress, but not done.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
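
For readers of the archive, the approach proposed in the quoted issue description (the sort itself calls progress as it makes progress, instead of a side thread calling it every second) can be illustrated with a minimal sketch. This is not the code from the attached patches. The `ProgressReporter` interface and the `ProgressReportingSort` class are hypothetical names introduced here for illustration, and a bottom-up merge sort is used only because its passes give natural points at which to report.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

public class ProgressReportingSort {

    /** Hypothetical stand-in for the framework's progress callback. */
    interface ProgressReporter {
        void progress();
    }

    /**
     * Bottom-up merge sort that calls reporter.progress() once per completed
     * merge pass. If the sort hangs, the calls stop, so a task timeout that
     * watches for progress can still fire.
     */
    static int[] sort(int[] input, ProgressReporter reporter) {
        int[] src = Arrays.copyOf(input, input.length);
        int[] dst = new int[src.length];
        for (int width = 1; width < src.length; width *= 2) {
            for (int lo = 0; lo < src.length; lo += 2 * width) {
                int mid = Math.min(lo + width, src.length);
                int hi = Math.min(lo + 2 * width, src.length);
                merge(src, dst, lo, mid, hi);
            }
            // Swap buffers: dst now holds the result of this pass.
            int[] tmp = src;
            src = dst;
            dst = tmp;
            // Report only after real work has been done.
            reporter.progress();
        }
        return src;
    }

    /** Merges the sorted runs src[lo, mid) and src[mid, hi) into dst[lo, hi). */
    private static void merge(int[] src, int[] dst, int lo, int mid, int hi) {
        int i = lo;
        int j = mid;
        for (int k = lo; k < hi; k++) {
            if (i < mid && (j >= hi || src[i] <= src[j])) {
                dst[k] = src[i++];
            } else {
                dst[k] = src[j++];
            }
        }
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        int[] sorted = sort(new int[] {5, 3, 8, 1, 9, 2}, calls::incrementAndGet);
        System.out.println(Arrays.toString(sorted) + " after " + calls.get() + " progress calls");
    }
}
```

By contrast, a background thread that reports progress on a fixed timer keeps reporting even when the sort is stuck, which is the behavior the issue describes as the wrong approach.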