Mailing-List: contact yarn-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Date: Wed, 18 May 2016 13:31:12 +0000 (UTC)
From: "Sunil G (JIRA)" <jira@apache.org>
To: yarn-issues@hadoop.apache.org
Message-ID: <JIRA.12923436.1450707621000.223561.1463578272944@Atlassian.JIRA>
In-Reply-To: <JIRA.12923436.1450707621000@Atlassian.JIRA>
References: <JIRA.12923436.1450707621000@Atlassian.JIRA> <JIRA.12923436.1450707621831@arcas>
Subject: [jira] [Commented] (YARN-4494) Recover completed apps
 asynchronously
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
archived-at: Wed, 18 May 2016 13:31:14 -0000


    [ https://issues.apache.org/jira/browse/YARN-4494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15288950#comment-15288950 ] 

Sunil G commented on YARN-4494:
-------------------------------

Hi [~kasha]

As per the problem statement, if we start recovering completed apps asynchronously, we may not know when that recovery will finish. So if a query (getApplication/Attempt etc.) arrives during this brief recovery period, we could immediately recover the queried app on the client's behalf (blocking the client RPC call while doing so) and serve its metrics/state etc.

So it won't be a lazy recovery when there is a request; we can recover the app immediately and serve it.
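
To make the idea concrete, here is a rough sketch of on-demand recovery. It is only an illustration under assumptions: CompletedAppRecovery, getOrRecover and the loader function are hypothetical names, not existing RM classes, and the app state is reduced to a String.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch; none of these names are real YARN classes.
class CompletedAppRecovery {
    // App id -> recovered state (a String stands in for the real app state).
    private final Map<String, String> recovered = new ConcurrentHashMap<>();
    // Assumed stand-in for reading one completed app from the state store.
    private final Function<String, String> loadFromStore;

    CompletedAppRecovery(Function<String, String> loadFromStore) {
        this.loadFromStore = loadFromStore;
    }

    // Called by the background recovery thread and by client queries alike.
    // computeIfAbsent applies the loader at most once per app id; a concurrent
    // caller asking for the same id waits until that load has finished.
    String getOrRecover(String appId) {
        return recovered.computeIfAbsent(appId, loadFromStore);
    }

    boolean isRecovered(String appId) {
        return recovered.containsKey(appId);
    }
}
```

With this shape, the async thread would simply walk the completed app ids and call getOrRecover for each, while a getApplication call for an app not yet reached blocks only for the time needed to load that single app.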

bq.If yes, do we recover everything when someone requests all apps? How about apps that match a specific category?
I was thinking along the same lines earlier. But here we may block the client call for a long time, until all apps are recovered. There are two options: 1) block the client call until all apps are recovered (this may take too long, and a timeout may occur), or 2) throw an error message/exception to the client indicating that recovery is in progress.
Neither of these is a very clean solution. But we have seen some drawbacks of recovering completed apps (in the case of thousands of completed apps). To avoid this issue, max-completed applications was configured to a lower value. cc/[~rohithsharma]
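
The two options for an "all apps" query could be sketched roughly as below. This is only an illustration under assumptions: RecoveryGate and RecoveryInProgressException are hypothetical names, not existing RM classes, and the timeout value is left to the caller.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical exception for option 2; not an existing YARN class.
class RecoveryInProgressException extends RuntimeException {
    RecoveryInProgressException(String message) {
        super(message);
    }
}

// Hypothetical gate that the async recovery thread opens once it is done.
class RecoveryGate {
    private final CountDownLatch done = new CountDownLatch(1);

    // Called by the recovery thread after the last completed app is loaded.
    void markRecoveryComplete() {
        done.countDown();
    }

    // Option 1: block the client call, but only up to timeoutMs.
    // Returns false if recovery has not finished in time or the wait was interrupted.
    boolean awaitRecovery(long timeoutMs) {
        try {
            return done.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    // Option 2: fail fast so the client can retry later.
    void checkRecovered() {
        if (done.getCount() > 0) {
            throw new RecoveryInProgressException("Recovery of completed apps is in progress");
        }
    }
}
```

Either way the client sees the delay; the difference is only whether it waits inside the RPC or gets an explicit signal to retry.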

[~kasha], please share your thoughts.

> Recover completed apps asynchronously
> -------------------------------------
>
>                 Key: YARN-4494
>                 URL: https://issues.apache.org/jira/browse/YARN-4494
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Jun Gong
>            Assignee: Jun Gong
>
> With RM HA enabled, when recovering apps, recover completed apps asynchronously.


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
