hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Created] (YARN-4334) Ability to avoid ResourceManager recovery if state store is "too old"
Date Thu, 05 Nov 2015 20:26:27 GMT
Jason Lowe created YARN-4334:
--------------------------------

             Summary: Ability to avoid ResourceManager recovery if state store is "too old"
                 Key: YARN-4334
                 URL: https://issues.apache.org/jira/browse/YARN-4334
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: resourcemanager
            Reporter: Jason Lowe


There are times when a ResourceManager has been down long enough that ApplicationMasters and
potentially external client-side monitoring mechanisms have given up completely.  If the ResourceManager
starts back up and tries to recover, we can get into situations where the RM launches new application
attempts for the AMs that gave up, but then the client _also_ launches another instance of
the app because it assumed everything was dead.

It would be nice if the RM could be optionally configured to avoid trying to recover if the
state store was "too old."  The RM would come up without any applications recovered, but we
would avoid a double-submission situation.
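
To illustrate the kind of check being proposed, here is a minimal sketch in Java. It is
not existing YARN code: the class name, the method name, and the idea that the state store
exposes a last-update timestamp are all assumptions made for illustration, as is the
maximum-age setting, which stands in for a hypothetical configuration property.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, not part of Hadoop YARN.
final class StaleStateStoreCheck {

    private StaleStateStoreCheck() {}

    /**
     * Decides whether the RM should attempt recovery.
     *
     * @param lastStoreUpdate when the state store was last written (assumed to be available)
     * @param now             the current time at RM startup
     * @param maxAge          hypothetical configured limit; zero or negative disables the check
     * @return true if recovery should proceed, false if the store is "too old"
     */
    static boolean shouldRecover(Instant lastStoreUpdate, Instant now, Duration maxAge) {
        // Treat a non-positive limit as "feature disabled": always recover.
        if (maxAge.isZero() || maxAge.isNegative()) {
            return true;
        }
        // A negative age (store timestamp ahead of the clock) still counts as fresh.
        Duration age = Duration.between(lastStoreUpdate, now);
        return age.compareTo(maxAge) <= 0;
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        Duration maxAge = Duration.ofHours(1);
        // Store written 5 minutes ago: recover as usual.
        System.out.println(shouldRecover(now.minus(Duration.ofMinutes(5)), now, maxAge));
        // Store written 2 days ago: skip recovery and start with no applications.
        System.out.println(shouldRecover(now.minus(Duration.ofDays(2)), now, maxAge));
    }
}
```

Making a non-positive limit mean "disabled" keeps the behavior optional, as requested above:
an RM without the setting would recover exactly as it does today.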



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
