Date: Sun, 6 Oct 2013 16:54:43 +0000 (UTC)
From: "Bikas Saha (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-1278) New AM does not start after rm restart

    [ https://issues.apache.org/jira/browse/YARN-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787678#comment-13787678 ]

Bikas Saha commented on YARN-1278:
----------------------------------

bq. I think on resync, we shouldn't destroy app resources. That is desired anyways as there is no need to just relocalize everything because of RM resync. That is the way it should be. Resync is different from restart.
Even restart shouldn't

> New AM does not start after rm restart
> --------------------------------------
>
> Key: YARN-1278
> URL: https://issues.apache.org/jira/browse/YARN-1278
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.1.1-beta
> Reporter: Yesha Vora
> Assignee: Hitesh Shah
> Priority: Blocker
> Attachments: YARN-1278.1.patch
>
>
> The new AM fails to start after RM restarts. It fails to start new Application master and job fails with below error.
> /usr/bin/mapred job -status job_1380985373054_0001
> 13/10/05 15:04:04 INFO client.RMProxy: Connecting to ResourceManager at hostname
> Job: job_1380985373054_0001
> Job File: /user/abc/.staging/job_1380985373054_0001/job.xml
> Job Tracking URL : http://hostname:8088/cluster/app/application_1380985373054_0001
> Uber job : false
> Number of maps: 0
> Number of reduces: 0
> map() completion: 0.0
> reduce() completion: 0.0
> Job state: FAILED
> retired: false
> reason for failure: There are no failed tasks for the job. Job is failed due to some other reason and reason can be found in the logs.
> Counters: 0

--
This message was sent by Atlassian JIRA
(v6.1#6144)