Return-Path: X-Original-To: apmail-hadoop-yarn-issues-archive@minotaur.apache.org Delivered-To: apmail-hadoop-yarn-issues-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id A13EB108B6 for ; Mon, 4 Nov 2013 18:58:19 +0000 (UTC) Received: (qmail 51928 invoked by uid 500); 4 Nov 2013 18:58:19 -0000 Delivered-To: apmail-hadoop-yarn-issues-archive@hadoop.apache.org Received: (qmail 51885 invoked by uid 500); 4 Nov 2013 18:58:19 -0000 Mailing-List: contact yarn-issues-help@hadoop.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: yarn-issues@hadoop.apache.org Delivered-To: mailing list yarn-issues@hadoop.apache.org Received: (qmail 51835 invoked by uid 99); 4 Nov 2013 18:58:19 -0000 Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 04 Nov 2013 18:58:19 +0000 Date: Mon, 4 Nov 2013 18:58:19 +0000 (UTC) From: "Hadoop QA (JIRA)" To: yarn-issues@hadoop.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (YARN-1210) During RM restart, RM should start a new attempt only when previous attempt exits for real MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/YARN-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813112#comment-13813112 ] Hadoop QA commented on YARN-1210: --------------------------------- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12611997/YARN-1210.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 6 new or modified test files. {color:red}-1 javac{color:red}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2362//console This message is automatically generated. > During RM restart, RM should start a new attempt only when previous attempt exits for real > ------------------------------------------------------------------------------------------ > > Key: YARN-1210 > URL: https://issues.apache.org/jira/browse/YARN-1210 > Project: Hadoop YARN > Issue Type: Sub-task > Reporter: Vinod Kumar Vavilapalli > Assignee: Omkar Vinit Joshi > Attachments: YARN-1210.1.patch, YARN-1210.2.patch > > > When RM recovers, it can wait for existing AMs to contact RM back and then kill them forcefully before even starting a new AM. Worst case, RM will start a new AppAttempt after waiting for 10 mins ( the expiry interval). This way we'll minimize multiple AMs racing with each other. This can help issues with downstream components like Pig, Hive and Oozie during RM restart. > In the mean while, new apps will proceed as usual as existing apps wait for recovery. > This can continue to be useful after work-preserving restart, so that AMs which can properly sync back up with RM can continue to run and those that don't are guaranteed to be killed before starting a new attempt. -- This message was sent by Atlassian JIRA (v6.1#6144)