Date: Wed, 14 Dec 2011 15:34:28 +0100
Message-ID: 
 <CAAEq_+uEYgc4o2Wv9MkTnJHm-VfzL2KrOFu_w8z99uUYVV9CFg@mail.gmail.com>
Subject: Overriding mapred.tasktracker.expiry.interval on a per-job basis
From: John Bond <john.r.bond@gmail.com>
To: common-user@hadoop.apache.org
Content-Type: text/plain; charset=ISO-8859-1

Hello,

I'm running a map/reduce job which does not send progress updates
during the reduce phase, so if that phase takes longer than 10 minutes
the task is treated as failed and restarted.  Under normal operation
this works, as the reduce phase only takes a few minutes; however, I am
trying to run this job with some historical data, and the reduce phase
is taking longer than 10 minutes and is constantly being restarted.

Obviously the correct fix is to implement a reporter[1]; this has been
done in the dev branch and will be rolled out once it has gone
through release management 8-|.  In the meantime, is there a way to
override mapred.tasktracker.expiry.interval for a specific job
without changing mapred-site.xml and restarting the cluster?
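
For context, the reporter fix amounts to calling progress() periodically
while the reduce loop runs. Below is a minimal, self-contained sketch of
that pattern, not our actual job code. ProgressCallback is a hypothetical
stand-in for the progress() method of org.apache.hadoop.mapred.Reporter[1]
so that the example compiles without Hadoop on the classpath;
sumWithProgress and REPORT_EVERY are likewise made-up names.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ProgressSketch {

    /** Hypothetical stand-in for Reporter.progress(); not a Hadoop type. */
    interface ProgressCallback {
        void progress();
    }

    /** Assumed reporting interval: signal liveness once per this many values. */
    static final int REPORT_EVERY = 1000;

    /**
     * Sums the values for one key, calling reporter.progress() every
     * REPORT_EVERY values and once more at the end, so that a long-running
     * loop keeps signalling that the task is still alive.
     */
    static long sumWithProgress(Iterator<Long> values, ProgressCallback reporter) {
        long sum = 0;
        long seen = 0;
        while (values.hasNext()) {
            sum += values.next();
            if (++seen % REPORT_EVERY == 0) {
                reporter.progress();
            }
        }
        reporter.progress(); // final signal once the key is fully processed
        return sum;
    }

    public static void main(String[] args) {
        List<Long> values = Arrays.asList(1L, 2L, 3L);
        int[] calls = {0}; // counts how often progress() was invoked
        long total = sumWithProgress(values.iterator(), () -> calls[0]++);
        System.out.println("sum=" + total + " progressCalls=" + calls[0]);
    }
}
```

In a real reducer the callback would be the Reporter instance passed to
reduce(), and the interval would need tuning to how slow each value is
to process.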

I attempted to do the following:

`hadoop jar /path/to/jar/job.jar  class.to.run
-Dmapred.tasktracker.expiry.interval=600000000 arg1 arg2`

In the job.conf I can see that the following is set; however, the jobs
are still marked as failed after 10 minutes:

mapred.tasktracker.expiry.interval	600000000

Thanks
John


[1] http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/Reporter.html