hadoop-yarn-issues mailing list archives

From "Suma Shivaprasad (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-8372) ApplicationAttemptNotFoundException should be handled correctly by Distributed Shell App Master
Date Fri, 01 Jun 2018 16:01:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-8372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498160#comment-16498160 ]

Suma Shivaprasad commented on YARN-8372:

Fixed checkstyle errors. Unable to reproduce the test failures locally (not sure whether this
is a flaky test). Retriggering tests with the new patch.

> ApplicationAttemptNotFoundException should be handled correctly by Distributed Shell App Master
> -----------------------------------------------------------------------------------------------
>                 Key: YARN-8372
>                 URL: https://issues.apache.org/jira/browse/YARN-8372
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: distributed-shell
>            Reporter: Charan Hebri
>            Assignee: Suma Shivaprasad
>            Priority: Major
>         Attachments: YARN-8372.1.patch, YARN-8372.2.patch, YARN-8372.3.patch
> {noformat}
> try {
>   response = client.allocate(progress);
> } catch (ApplicationAttemptNotFoundException e) {
>   handler.onShutdownRequest();
>   LOG.info("Shutdown requested. Stopping callback.");
>   return;
> }
> {noformat}
> The snippet above is from AMRMClientAsyncImpl. The corresponding onShutdownRequest
> implementation in the Distributed Shell App Master is:
> {noformat}
> @Override
> public void onShutdownRequest() {
>   done = true;
> }
> {noformat}
> Due to the above change, whenever an application attempt fails because of an NM restart
> (the NM where the DS AM is running), an ApplicationAttemptNotFoundException is thrown, and
> all containers for that attempt, including those running on other NMs, are killed by the AM
> and marked as COMPLETE. The subsequent attempt then spawns new containers, just as a
> brand-new attempt would. This differs from a MapReduce application, where the containers
> are not killed.
> cc [~rohithsharma]
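
To make the interaction described in the issue easier to follow, here is a minimal, self-contained sketch of it. It does not use the YARN APIs: ShutdownSketch, AttemptNotFoundException, RmClient, CallbackHandler, ToyAppMaster, heartbeatOnce and finishIfDone are all hypothetical stand-ins invented for illustration, and the cleanup step only assumes, per the description above, that the AM kills its containers once done is set. It is not the patch attached to this issue.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the reported behavior; none of these types are real YARN classes. */
public class ShutdownSketch {

  /** Hypothetical stand-in for ApplicationAttemptNotFoundException. */
  static class AttemptNotFoundException extends Exception {}

  /** Hypothetical stand-in for the async client's callback handler. */
  interface CallbackHandler {
    void onShutdownRequest();
  }

  /** Hypothetical stand-in for the RM client used by the heartbeat loop. */
  interface RmClient {
    void allocate(float progress) throws AttemptNotFoundException;
  }

  /** Toy app master that reacts to a shutdown request the way the issue describes. */
  static class ToyAppMaster implements CallbackHandler {
    volatile boolean done = false;
    final List<String> running = new ArrayList<>();
    final List<String> killed = new ArrayList<>();

    @Override
    public void onShutdownRequest() {
      // Mirrors the quoted handler: a shutdown request simply marks the AM as done.
      done = true;
    }

    /** Assumed cleanup path: once done is set, every running container is killed. */
    void finishIfDone() {
      if (done) {
        killed.addAll(running);
        running.clear();
      }
    }
  }

  /** One heartbeat, shaped like the quoted try/catch. Returns false if shutdown was requested. */
  static boolean heartbeatOnce(RmClient client, CallbackHandler handler) {
    try {
      client.allocate(0.5f);
      return true;
    } catch (AttemptNotFoundException e) {
      handler.onShutdownRequest();
      return false;
    }
  }

  public static void main(String[] args) {
    ToyAppMaster am = new ToyAppMaster();
    am.running.add("container_on_nm1");
    am.running.add("container_on_nm2");

    // Simulate the RM no longer knowing the attempt (for example after the AM's NM restarted).
    RmClient lostAttempt = progress -> { throw new AttemptNotFoundException(); };

    heartbeatOnce(lostAttempt, am);
    am.finishIfDone();

    System.out.println("done=" + am.done + " killed=" + am.killed);
  }
}
```

In this toy model a single failed heartbeat is enough to tear down containers on every node, which is the behavior the issue reports. A handler that treated this exception differently from an ordinary shutdown request would avoid that, but what the real fix does is defined by the attached patches, not by this sketch.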

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org
