Date: Mon, 10 Mar 2014 03:49:43 +0000 (UTC)
From: "Xuan Gong (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-1764) Handle RM fail overs after the submitApplication call.

    [ https://issues.apache.org/jira/browse/YARN-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925427#comment-13925427 ]

Xuan Gong commented on YARN-1764:
---------------------------------

bq. Can you add a log in YarnClientImpl when we retry the submission?
DONE

bq. Can you improve the documentation of the submitApp() API in ApplicationClientProtocol about the clients needing to retry when the specified exception happens?
ADDED

bq. Also add the exception to the documentation of the base protocol.
ADDED

bq. Document YarnClient's submit API that we automatically retry when this issue happens.
ADDED

bq. All the new files added in the patch have some formatting issues.
FIXED

bq. In both the test-cases, after the fail-over, we assert for the states that are not expected (assertFalse). Can we explicitly test for the cases that we expect (assertTrue)?
CHANGED

bq. I think we should also mark getApplicationReport() to be idempotent in this patch itself as RM can fail-over after submitApplication() returned but during a getApplicationReport(). We will need to add some tests for this too.
ADDED

> Handle RM fail overs after the submitApplication call.
> ------------------------------------------------------
>
>                 Key: YARN-1764
>                 URL: https://issues.apache.org/jira/browse/YARN-1764
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Xuan Gong
>            Assignee: Xuan Gong
>         Attachments: YARN-1764.1.patch, YARN-1764.2.patch
>
>

--
This message was sent by Atlassian JIRA
(v6.2#6252)
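
As a rough illustration of the behaviour discussed in the comment above (the client retrying a submission when the RM fails over after submitApplication, and the status query being safe to repeat), here is a minimal, self-contained sketch. It does not use the Hadoop API and is not the code from the attached patches: ResourceManager, FakeResourceManager, AppNotFoundException and submitWithRetry are hypothetical names invented for this example, and the "fail-over" is simulated by an in-memory RM that forgets the first submission.

```java
import java.util.HashSet;
import java.util.Set;

// Client-side retry sketch. All types in this file are hypothetical
// stand-ins for illustration; none of them are Hadoop classes.
class RetryingSubmitter {
    // Submits once, then polls for status. If the (possibly failed-over) RM
    // does not know the application, the submission is sent again.
    // maxChecks bounds the number of status checks.
    static String submitWithRetry(ResourceManager rm, String appId, int maxChecks) {
        rm.submitApplication(appId);
        for (int check = 1; ; check++) {
            try {
                // Read-only call: repeating it has no side effects (idempotent).
                return rm.getApplicationState(appId);
            } catch (AppNotFoundException e) {
                if (check >= maxChecks) {
                    throw e; // give up after the configured number of checks
                }
                // Log the retry so the resubmission is visible to the user.
                System.out.println("Application " + appId
                        + " not found, re-submitting (check " + check + ")");
                rm.submitApplication(appId);
            }
        }
    }

    public static void main(String[] args) {
        FakeResourceManager rm = new FakeResourceManager(true);
        String state = submitWithRetry(rm, "app_1", 3);
        System.out.println("state=" + state + ", submissions=" + rm.submissions);
    }
}

// Hypothetical "application unknown to the RM" error (unchecked for brevity).
class AppNotFoundException extends RuntimeException {
    AppNotFoundException(String appId) {
        super("unknown application: " + appId);
    }
}

// Hypothetical, heavily reduced view of the RM as seen by a client.
interface ResourceManager {
    void submitApplication(String appId);
    String getApplicationState(String appId);
}

// In-memory RM that can lose its state once, imitating a fail-over to a
// standby that never saw the first submission.
class FakeResourceManager implements ResourceManager {
    private final Set<String> apps = new HashSet<>();
    private boolean failOverPending;
    int submissions = 0;

    FakeResourceManager(boolean failOverAfterFirstSubmit) {
        this.failOverPending = failOverAfterFirstSubmit;
    }

    @Override
    public void submitApplication(String appId) {
        submissions++;
        apps.add(appId);
        if (failOverPending) {
            apps.clear();          // simulated fail-over: state is lost
            failOverPending = false;
        }
    }

    @Override
    public String getApplicationState(String appId) {
        if (!apps.contains(appId)) {
            throw new AppNotFoundException(appId);
        }
        return "ACCEPTED";
    }
}
```

In this sketch the retry is driven by the status query rather than by the submit call itself, because the failure only becomes visible when the new RM is asked about an application it has never seen. Adding the same application ID twice is harmless here since the fake RM stores IDs in a set; whether a real RM treats resubmission that way is outside what this example shows.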