hadoop-yarn-issues mailing list archives

From "Jian He (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-6153) keepContainer does not work when AM retry window is set
Date Fri, 24 Feb 2017 18:58:44 GMT

    [ https://issues.apache.org/jira/browse/YARN-6153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883323#comment-15883323 ]

Jian He commented on YARN-6153:
-------------------------------

[~kyungwan nam], thanks for the update; the patch looks good to me overall.
I found several places in RMAppAttemptImpl that retrieve the attempt's RMApp this way:
{code}
appAttempt.rmContext.getRMApps().get(
    appAttempt.getAppAttemptId().getApplicationId())
{code}
I think we can change the RMAppAttemptImpl constructor to take the RMApp as a parameter, so that
we no longer need the hash map lookup to trace back to its RMApp. Would you like to make that change?
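
As a rough illustration of the suggestion, here is a minimal sketch using hypothetical stand-in classes (SketchApp, SketchContext, and SketchAttempt are invented for this example and are not the real RMApp, RMContext, or RMAppAttemptImpl). It contrasts the map-based back trace with a parent reference passed through the constructor:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for an application; NOT the real RMApp.
final class SketchApp {
    private final String applicationId;

    SketchApp(String applicationId) {
        this.applicationId = applicationId;
    }

    String getApplicationId() {
        return applicationId;
    }
}

// Hypothetical stand-in for the context that owns the id -> app map.
final class SketchContext {
    private final Map<String, SketchApp> apps = new ConcurrentHashMap<>();

    Map<String, SketchApp> getApps() {
        return apps;
    }
}

// Hypothetical stand-in for an application attempt.
final class SketchAttempt {
    private final SketchContext context;
    private final String applicationId;
    private final SketchApp app; // parent handed in at construction time

    SketchAttempt(SketchContext context, SketchApp app) {
        this.context = context;
        this.applicationId = app.getApplicationId();
        this.app = app;
    }

    // Back trace through the map: returns null if the entry is missing.
    SketchApp lookUpAppViaContext() {
        return context.getApps().get(applicationId);
    }

    // Direct reference: no lookup, and no dependency on the map's contents.
    SketchApp getApp() {
        return app;
    }
}

final class AttemptParentDemo {
    public static void main(String[] args) {
        SketchContext context = new SketchContext();
        SketchApp app = new SketchApp("application_0001");
        context.getApps().put(app.getApplicationId(), app);

        SketchAttempt attempt = new SketchAttempt(context, app);
        System.out.println(attempt.lookUpAppViaContext() == app); // prints true
        System.out.println(attempt.getApp() == app);              // prints true
    }
}
```

In this sketch the direct reference stays valid even if the map entry is removed, which is one reason the constructor-parameter approach may be simpler to reason about.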

> keepContainer does not work when AM retry window is set
> -------------------------------------------------------
>
>                 Key: YARN-6153
>                 URL: https://issues.apache.org/jira/browse/YARN-6153
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.7.1
>            Reporter: kyungwan nam
>         Attachments: YARN-6153.001.patch, YARN-6153.002.patch, YARN-6153.003.patch, YARN-6153.004.patch, YARN-6153.005.patch
>
>
> yarn.resourcemanager.am.max-attempts is set to 2 in my cluster.
> I submitted a YARN application (a Slider app) with keepContainers=true and attemptFailuresValidityInterval=300000.
> It worked properly when the AM failed the first time:
> all containers launched by the previous AM were resynced with the new AM (attempt 2) without being killed.
> After 10 minutes, I expected the AM failure count to have been reset by attemptFailuresValidityInterval (5 minutes).
> However, all containers were killed when the AM failed a second time. (The new AM, attempt 3, was launched properly.)
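
The expectation in the report above can be sketched as a windowed failure count. This is only an illustration of the reporter's reasoning, not the ResourceManager's actual logic; the class FailureWindowSketch and its methods are hypothetical names made up for this example. It assumes that only failures newer than the validity interval count toward the max-attempts limit:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the expected behavior; not YARN code.
final class FailureWindowSketch {
    private final int maxAttempts;
    private final long validityIntervalMs; // assumption: <= 0 means every failure counts
    private final List<Long> failureTimesMs = new ArrayList<>();

    FailureWindowSketch(int maxAttempts, long validityIntervalMs) {
        this.maxAttempts = maxAttempts;
        this.validityIntervalMs = validityIntervalMs;
    }

    void recordFailure(long nowMs) {
        failureTimesMs.add(nowMs);
    }

    // Count only the failures that happened within the validity interval.
    int failuresInWindow(long nowMs) {
        if (validityIntervalMs <= 0) {
            return failureTimesMs.size();
        }
        int count = 0;
        for (long failedAtMs : failureTimesMs) {
            if (nowMs - failedAtMs < validityIntervalMs) {
                count++;
            }
        }
        return count;
    }

    // True when the counted failures have reached the attempt limit.
    boolean retriesExhausted(long nowMs) {
        return failuresInWindow(nowMs) >= maxAttempts;
    }

    public static void main(String[] args) {
        // Values from the report: max-attempts = 2, validity interval = 300000 ms.
        FailureWindowSketch sketch = new FailureWindowSketch(2, 300_000L);
        sketch.recordFailure(0L);        // first AM failure
        sketch.recordFailure(600_000L);  // second AM failure, 10 minutes later

        System.out.println(sketch.failuresInWindow(600_000L)); // prints 1
        System.out.println(sketch.retriesExhausted(600_000L)); // prints false
    }
}
```

Under these assumptions the second failure is not treated as the final attempt, which is why the reporter expected the containers to be kept rather than killed.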



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
