From: Eric Baldeschwieler
Subject: Re: [jira] Commented: (HADOOP-181) task trackers should not restart for having a late heartbeat
Date: Thu, 10 Aug 2006 16:41:12 -0700
To: hadoop-dev@lucene.apache.org
In-Reply-To: <18622998.1155238695841.JavaMail.jira@brutus>

(RESEND, MEANT TO ATTACH THE COMMENT BELOW TO THIS POSTING)

Why don't we include documenting this as part of the "map-reduce walk-through" sprint item?

-----

On reintegrating lost task trackers...

It does seem to me that we should do this, but we need to make sure we reason through how this affects corner cases, which invariants the system maintains, and so on. I suggest we work this through, then go forward with this patch (modified if we find any corner cases) and post the reasoning, so we can review it as this logic evolves. (And update any existing documentation in this area, of course...)

On Aug 10, 2006, at 12:38 PM, Devaraj Das (JIRA) wrote:

> [ http://issues.apache.org/jira/browse/HADOOP-181?page=comments#action_12427327 ]
>
> Devaraj Das commented on HADOOP-181:
> ------------------------------------
>
> Doug, does it make sense to do what is done in this patch only when
> speculative execution is on?
>
>> task trackers should not restart for having a late heartbeat
>> ------------------------------------------------------------
>>
>>                 Key: HADOOP-181
>>                 URL: http://issues.apache.org/jira/browse/HADOOP-181
>>             Project: Hadoop
>>          Issue Type: Bug
>>          Components: mapred
>>            Reporter: Owen O'Malley
>>         Assigned To: Devaraj Das
>>             Fix For: 0.6.0
>>
>>         Attachments: lost-heartbeat.patch
>>
>>
>> TaskTrackers should not close and restart themselves for having a
>> late heartbeat. The JobTracker should just accept their current
>> status.
>
> --
> This message is automatically generated by JIRA.