Mailing-List: contact yarn-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: yarn-issues@hadoop.apache.org
Date: Fri, 16 Oct 2015 16:46:05 +0000 (UTC)
From: "Junping Du (JIRA)" <jira@apache.org>
To: yarn-issues@hadoop.apache.org
Message-ID: <JIRA.12905581.1445013450000.60360.1445013965267@Atlassian.JIRA>
In-Reply-To: <JIRA.12905581.1445013450000@Atlassian.JIRA>
References: <JIRA.12905581.1445013450000@Atlassian.JIRA>
 <JIRA.12905581.1445013450247@arcas>
Subject: [jira] [Updated] (YARN-4274) NodeStatusUpdaterImpl should register
 to RM again after a non-fatal exception happen before
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


     [ https://issues.apache.org/jira/browse/YARN-4274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated YARN-4274:
-----------------------------
    Description: As seen in YARN-3896, a non-fatal exception such as a response ID mismatch between the NM and the RM (due to a race condition) causes the NM to stop working. I think we should make it more robust by tolerating a few failures when registering with the RM and retrying a few times.  (was: As seen in YARN-3896, a non-fatal exception such as a response ID mismatch between the NM and the RM (due to a race condition) causes the NM to stop working. I think we should make it more robust by tolerating a few failures when registering with the RM.)

> NodeStatusUpdaterImpl should register to RM again after a non-fatal exception happen before
> -------------------------------------------------------------------------------------------
>
>                 Key: YARN-4274
>                 URL: https://issues.apache.org/jira/browse/YARN-4274
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Junping Du
>            Assignee: Junping Du
>
> As seen in YARN-3896, a non-fatal exception such as a response ID mismatch between the NM and the RM (due to a race condition) causes the NM to stop working. I think we should make it more robust by tolerating a few failures when registering with the RM and retrying a few times.
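
For illustration only, the proposed behaviour (tolerate a few registration failures and retry a few times before giving up) could look roughly like the sketch below. This is not the actual NodeStatusUpdaterImpl code or any patch attached to this issue. The names RegistrationRetry, NonFatalException, Registrar and registerWithRetry are all hypothetical, and the retry count and sleep interval are assumed parameters rather than existing YARN configuration keys.

```java
/** Hypothetical sketch: bounded retry around a registration call. Not Hadoop code. */
class RegistrationRetry {

    /** Hypothetical stand-in for a non-fatal error such as a response ID mismatch. */
    public static class NonFatalException extends Exception {
        public NonFatalException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for the NM's "register with the RM" step. */
    public interface Registrar {
        void register() throws NonFatalException;
    }

    /**
     * Calls register() up to maxAttempts times, sleeping sleepMillis between
     * failed attempts. Returns the number of attempts used on success and
     * rethrows the last non-fatal exception if every attempt fails.
     */
    public static int registerWithRetry(Registrar registrar, int maxAttempts, long sleepMillis)
            throws NonFatalException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        NonFatalException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                registrar.register();
                return attempt; // registered successfully
            } catch (NonFatalException e) {
                last = e; // remember the failure and try again
                if (attempt < maxAttempts) {
                    Thread.sleep(sleepMillis);
                }
            }
        }
        throw last; // all attempts failed; let the caller decide what to do
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated registrar that fails twice and then succeeds.
        int used = registerWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new NonFatalException("response ID mismatch");
            }
        }, 5, 10L);
        System.out.println("registered after " + used + " attempt(s)");
    }
}
```

The bound matters here: a small fixed number of attempts lets the NM ride out a transient race, while a failure that persists still surfaces to the caller instead of being retried forever.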


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)