Date: Wed, 14 Dec 2011 15:34:28 +0100
Subject: Overriding mapred.tasktracker.expiry.interval on a per-job basis
From: John Bond
To: common-user@hadoop.apache.org

Hello,

I'm running a map/reduce job which does not send progress updates during the reduce phase, so if that phase takes longer than 10 minutes the task is seen as failed and is restarted. Under normal operation this works, as the reduce phase only takes a few minutes; however, I am trying to run this job with some historical data, and the reduce phase is taking longer than 10 minutes and constantly being restarted.

Obviously the correct fix is to implement a reporter[1], which has been done in the dev branch and will be rolled out once it has gone through release management 8-|.

In the meantime, is there a way to override mapred.tasktracker.expiry.interval for a specific job without changing mapred-site.xml and restarting the cluster? I attempted the following:

`hadoop jar /path/to/jar/job.jar class.to.run -Dmapred.tasktracker.expiry.interval=600000000 arg1 arg2`

In the job.conf I can see that the following is set; however, the jobs are still seen as failed after 10 minutes:

mapred.tasktracker.expiry.interval 600000000

Thanks
John

[1] http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/Reporter.html
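
For illustration, the reporter approach mentioned in the message can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the job's actual code: `ProgressSink`, `ProgressSketch`, and `sumWithProgress` are hypothetical names, and `ProgressSink` is only a stand-in for the `progress()` method of Hadoop's `Reporter` so that the example compiles without Hadoop on the classpath. In a real old-API reducer the equivalent call would be `reporter.progress()` inside the loop over the values.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the progress() method of
// org.apache.hadoop.mapred.Reporter (an assumption made so the sketch
// is self-contained).
interface ProgressSink {
    void progress();
}

final class ProgressSketch {
    // Sums the values for one key and pings the sink once every `every`
    // records, so a long-running reduce keeps signalling that it is alive.
    static long sumWithProgress(Iterator<Long> values, ProgressSink sink, int every) {
        long sum = 0;
        long seen = 0;
        while (values.hasNext()) {
            sum += values.next();
            seen++;
            if (seen % every == 0) {
                sink.progress();
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        AtomicInteger pings = new AtomicInteger();
        long total = sumWithProgress(
                Arrays.asList(1L, 2L, 3L, 4L, 5L).iterator(),
                pings::incrementAndGet,
                2);
        // Prints the sum and the number of progress pings sent.
        System.out.println(total + " " + pings.get());
    }
}
```

Reporting every N records rather than on every record keeps the overhead small; the right interval depends on how long each record takes to process.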