Subject: Re: mapreduce.jobtracker.expire.trackers.interval no effect
From: Adam Kawa
To: user@hadoop.apache.org
Date: Tue, 3 Dec 2013 19:23:24 +0100

I did a small test, and setting mapred.tasktracker.expiry.interval=60000
worked for me (the TT was considered lost after around 66 seconds).

Could the formula be: mapred.tasktracker.expiry.interval + 2 *
some-heartbeat-interval-that-is-3-sec-by-default?

Otherwise, is the 6 sec some kind of time needed to make the decision to
consider the TT lost?

2013/12/3 Hansi Klose

> I forgot to say that we use Cloudera 2.0.0-mr1-cdh4.2.0
>
> > Sent: Tuesday, 03 December 2013 at 17:38
> > From: "Hansi Klose"
> > To: user@hadoop.apache.org
> > Subject: mapreduce.jobtracker.expire.trackers.interval no effect
> >
> > Hi,
> >
> > we want to set the heartbeat timeout for a tasktracker.
> >
> > If the tasktracker does not send heartbeats for 60 seconds, it should
> > be marked as lost.
> >
> > I found the parameter mapreduce.jobtracker.expire.trackers.interval,
> > which sounds right to me.
> >
> > I set
> >
> > <property>
> >   <name>mapreduce.jobtracker.expire.trackers.interval</name>
> >   <value>60000</value>
> > </property>
> >
> > in the mapred-site.xml on all servers and restarted the jobtracker and
> > all tasktrackers.
> >
> > I started a benchmark "hadoop jar hadoop-examples.jar randomwriter rand"
> > and every tasktracker gets 2 jobs.
> > It is a small test environment.
> >
> > On one tasktracker I stopped the network. On the jobtracker I could see
> > the "Seconds since heartbeat" increasing. But after 60 seconds the
> > tasktracker was still in the overview.
> > Even in the log of the jobtracker I found nothing.
> >
> > After over 600 seconds I found the message
> > org.apache.hadoop.mapred.JobTracker: Lost tracker .....
> > And the tasktracker wasn't shown any more on the jobtracker.
> >
> > Isn't this the right setting?
> >
> > Regards Hansi
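
The formula guessed at in the reply can be checked with simple arithmetic. The sketch below is only an illustration of that guess, not confirmed JobTracker behaviour: the function name and parameters are hypothetical, and the 3-second heartbeat interval and the factor of 2 are taken from the wording of the reply.

```python
# Hypothetical helper: evaluates the formula guessed in the reply,
#   expiry interval + 2 * heartbeat interval
# It does not reproduce how the JobTracker actually decides a tracker is lost.
def guessed_lost_tracker_delay_ms(expiry_interval_ms,
                                  heartbeat_interval_ms=3000,
                                  heartbeat_count=2):
    """Return the delay (in ms) predicted by the guessed formula."""
    return expiry_interval_ms + heartbeat_count * heartbeat_interval_ms


# mapred.tasktracker.expiry.interval=60000, as used in the test above
predicted_ms = guessed_lost_tracker_delay_ms(60000)
print(predicted_ms / 1000)  # 66.0
```

With an expiry interval of 60000 ms the guess predicts 66 seconds, which matches the "around 66 seconds" observed in the test. A single matching measurement is consistent with the formula but does not confirm it.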