Message-ID: <1262117.1153970294734.JavaMail.jira@brutus>
Date: Wed, 26 Jul 2006 20:18:14 -0700 (PDT)
From: "Owen O'Malley (JIRA)"
To: hadoop-dev@lucene.apache.org
Subject: [jira] Commented: (HADOOP-39) Job killed when backup tasks fail
In-Reply-To: <2079824794.1140051704684.JavaMail.jira@ajax.apache.org>

    [ http://issues.apache.org/jira/browse/HADOOP-39?page=comments#action_12423760 ]

Owen O'Malley commented on HADOOP-39:
-------------------------------------

My goal with this would be to do the equivalent of "make -k" or a "best effort" job. If the option was set, the job would continue after a given TIP had failed 4 times, but that TIP would be abandoned.

> Job killed when backup tasks fail
> ---------------------------------
>
>                 Key: HADOOP-39
>                 URL: http://issues.apache.org/jira/browse/HADOOP-39
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Owen O'Malley
>
> I had a map job with side effects that meant that any speculative tasks would fail.
> Currently, the job tracker kills the job when the speculative task fails 4 times.
> It would be better to stop scheduling speculative tasks for that fragment, but let the job continue as long as one of the instances of that fragment continues to run.
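
A minimal sketch of the "best effort" behaviour proposed in the comment above, written in Python for brevity rather than in Hadoop's Java. The names Job, Tip, best_effort and record_failure are hypothetical and do not come from the Hadoop code base; only the limit of 4 failures per TIP (task in progress) is taken from the message.

```python
from dataclasses import dataclass, field

MAX_ATTEMPTS = 4  # per-TIP failure limit quoted in the comment


@dataclass
class Tip:
    """Hypothetical stand-in for a task in progress (TIP)."""
    name: str
    failures: int = 0
    abandoned: bool = False


@dataclass
class Job:
    """Hypothetical job record; best_effort models the proposed option."""
    best_effort: bool = False
    killed: bool = False
    tips: dict = field(default_factory=dict)

    def record_failure(self, tip_name: str) -> None:
        """Count one failed attempt and apply the failure policy."""
        tip = self.tips.setdefault(tip_name, Tip(tip_name))
        if self.killed or tip.abandoned:
            return  # nothing left to decide for this TIP
        tip.failures += 1
        if tip.failures < MAX_ATTEMPTS:
            return  # still eligible for another attempt
        if self.best_effort:
            tip.abandoned = True  # give up on this TIP only, like "make -k"
        else:
            self.killed = True  # behaviour described in the issue: job dies
```

With best_effort set, the fourth failure of a TIP marks only that TIP as abandoned and the job keeps running; without it, the same failure kills the whole job.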