hadoop-yarn-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4334) Ability to avoid ResourceManager recovery if state store is "too old"
Date Fri, 20 Nov 2015 19:44:11 GMT

    [ https://issues.apache.org/jira/browse/YARN-4334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15018637#comment-15018637 ]

Hadoop QA commented on YARN-4334:
---------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} docker {color} | {color:red} 5m 31s {color} | {color:red} Docker failed to build yetus/hadoop:date2015-11-20. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12773579/YARN-4334.2.patch |
| JIRA Issue | YARN-4334 |
| Powered by | Apache Yetus   http://yetus.apache.org |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/9749/console |


This message was automatically generated.



> Ability to avoid ResourceManager recovery if state store is "too old"
> ---------------------------------------------------------------------
>
>                 Key: YARN-4334
>                 URL: https://issues.apache.org/jira/browse/YARN-4334
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Jason Lowe
>            Assignee: Chang Li
>         Attachments: YARN-4334.2.patch, YARN-4334.patch, YARN-4334.wip.2.patch, YARN-4334.wip.3.patch, YARN-4334.wip.4.patch, YARN-4334.wip.patch
>
>
> There are times when a ResourceManager has been down long enough that ApplicationMasters
> and potentially external client-side monitoring mechanisms have given up completely.  If the
> ResourceManager starts back up and tries to recover we can get into situations where the RM
> launches new application attempts for the AMs that gave up, but then the client _also_ launches
> another instance of the app because it assumed everything was dead.
> It would be nice if the RM could be optionally configured to avoid trying to recover
> if the state store was "too old."  The RM would come up without any applications recovered,
> but we would avoid a double-submission situation.
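
The check requested above could be sketched roughly as follows. This is only an illustration and is not taken from any of the attached patches: it assumes the state store exposes a last-update timestamp, and the names `state_store_too_old`, `apps_to_recover`, and `max_age_ms` are hypothetical.

```python
import time


def state_store_too_old(last_update_ms, max_age_ms, now_ms=None):
    """Return True when the state store is older than the allowed age.

    max_age_ms is a hypothetical setting; a value <= 0 disables the
    check so that recovery always proceeds, which keeps it optional.
    """
    if max_age_ms <= 0:
        return False
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return now_ms - last_update_ms > max_age_ms


def apps_to_recover(stored_apps, last_update_ms, max_age_ms, now_ms=None):
    """Return the applications to recover, or none if the store is stale."""
    if state_store_too_old(last_update_ms, max_age_ms, now_ms):
        # Start with no recovered applications to avoid double submission.
        return []
    return list(stored_apps)
```

With the check disabled (the assumed default here), behavior is unchanged; with a positive maximum age, a stale store results in the RM starting with no recovered applications.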



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
