From: cutting@apache.org
To: hadoop-commits@lucene.apache.org
Reply-To: hadoop-dev@lucene.apache.org
Date: Wed, 11 Oct 2006 19:32:48 -0000
Subject: svn commit: r462911 - in /lucene/hadoop/trunk: CHANGES.txt src/java/org/apache/hadoop/mapred/Task.java
Message-Id: <20061011193248.A76AF1A981A@eris.apache.org>

Author: cutting
Date: Wed Oct 11 12:32:47 2006
New Revision: 462911

URL: http://svn.apache.org/viewvc?view=rev&rev=462911
Log:
HADOOP-598.  Fix tasks to retry when reporting completion.  Contributed by Owen.

Modified:
    lucene/hadoop/trunk/CHANGES.txt
    lucene/hadoop/trunk/src/java/org/apache/hadoop/mapred/Task.java

Modified: lucene/hadoop/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/CHANGES.txt?view=diff&rev=462911&r1=462910&r2=462911
==============================================================================
--- lucene/hadoop/trunk/CHANGES.txt (original)
+++ lucene/hadoop/trunk/CHANGES.txt Wed Oct 11 12:32:47 2006
@@ -13,6 +13,9 @@
     .999, so that nearly all blocks must be reported before filesystem
     modifications are permitted.  (Konstantin Shvachko via cutting)
 
+ 4. HADOOP-598.  Fix tasks to retry when reporting completion, so that
+    a single RPC timeout won't fail a task.  (omalley via cutting)
+
 
 Release 0.7.0 - 2006-10-06
 

Modified: lucene/hadoop/trunk/src/java/org/apache/hadoop/mapred/Task.java
URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/java/org/apache/hadoop/mapred/Task.java?view=diff&rev=462911&r1=462910&r2=462911
==============================================================================
--- lucene/hadoop/trunk/src/java/org/apache/hadoop/mapred/Task.java (original)
+++ lucene/hadoop/trunk/src/java/org/apache/hadoop/mapred/Task.java Wed Oct 11 12:32:47 2006
@@ -176,10 +176,26 @@
     }
   }
 
-  public void done(TaskUmbilicalProtocol umbilical)
-    throws IOException {
-    umbilical.progress(getTaskId(),               // send a final status report
-                       taskProgress.get(), taskProgress.toString(), phase);
-    umbilical.done(getTaskId());
+  public void done(TaskUmbilicalProtocol umbilical) throws IOException {
+    int retries = 10;
+    boolean needProgress = true;
+    while (true) {
+      try {
+        if (needProgress) {
+          // send a final status report
+          umbilical.progress(getTaskId(), taskProgress.get(),
+                             taskProgress.toString(), phase);
+          needProgress = false;
+        }
+        umbilical.done(getTaskId());
+        return;
+      } catch (IOException ie) {
+        LOG.warn("Failure signalling completion: " +
+                 StringUtils.stringifyException(ie));
+        if (--retries == 0) {
+          throw ie;
+        }
+      }
+    }
   }
 }
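
The retry loop in the patch can be tried in isolation with a small stand-alone sketch. Everything below other than the control flow is an assumption made for illustration: the `Umbilical` interface is a hypothetical stand-in for `TaskUmbilicalProtocol` with a simplified `progress` signature, `FlakyUmbilical` is a made-up test double that simulates RPC timeouts, and the task id string is invented. It is not the Hadoop code itself.

```java
import java.io.IOException;

public class RetryDoneSketch {

  /** Hypothetical stand-in for TaskUmbilicalProtocol; only the two calls used here. */
  interface Umbilical {
    void progress(String taskId, float progress, String state) throws IOException;
    void done(String taskId) throws IOException;
  }

  /** Made-up test double: done() fails the first `failures` times, then succeeds. */
  static class FlakyUmbilical implements Umbilical {
    private int failuresLeft;
    int progressCalls = 0;
    int doneAttempts = 0;

    FlakyUmbilical(int failures) {
      this.failuresLeft = failures;
    }

    public void progress(String taskId, float progress, String state) {
      progressCalls++;
    }

    public void done(String taskId) throws IOException {
      doneAttempts++;
      if (failuresLeft > 0) {
        failuresLeft--;
        throw new IOException("simulated RPC timeout");
      }
    }
  }

  /** Same control flow as the patched done(): bounded retries, final progress sent once. */
  static void reportDone(Umbilical umbilical, String taskId, int retries)
      throws IOException {
    boolean needProgress = true;
    while (true) {
      try {
        if (needProgress) {
          // send a final status report; skipped on later attempts once it succeeds
          umbilical.progress(taskId, 1.0f, "done");
          needProgress = false;
        }
        umbilical.done(taskId);
        return;
      } catch (IOException ie) {
        System.err.println("Failure signalling completion: " + ie.getMessage());
        if (--retries == 0) {
          throw ie; // out of attempts: surface the last failure to the caller
        }
      }
    }
  }

  public static void main(String[] args) throws IOException {
    FlakyUmbilical flaky = new FlakyUmbilical(3);
    reportDone(flaky, "task_0001_m_000000_0", 10); // hypothetical task id
    System.out.println("progress calls: " + flaky.progressCalls
        + ", done attempts: " + flaky.doneAttempts);
    // prints "progress calls: 1, done attempts: 4"
  }
}
```

As in the patch, the loop retries immediately with no delay between attempts, and it assumes a positive retry count: with a count of zero or less, `--retries == 0` would not be reached on the first failures.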