hadoop-yarn-dev mailing list archives

From "Bikas Saha (JIRA)" <j...@apache.org>
Subject [jira] [Created] (YARN-1373) Transition RMApp and RMAppAttempt state to RUNNING after restart for recovered running apps
Date Wed, 30 Oct 2013 07:21:26 GMT
Bikas Saha created YARN-1373:
--------------------------------

             Summary: Transition RMApp and RMAppAttempt state to RUNNING after restart for recovered running apps
                 Key: YARN-1373
                 URL: https://issues.apache.org/jira/browse/YARN-1373
             Project: Hadoop YARN
          Issue Type: Sub-task
            Reporter: Bikas Saha


Currently, the RM moves recovered app attempts to a terminal recovered state and starts
a new attempt. Instead, it will have to transition the last attempt to a running state so
that it can proceed as normal once the running attempt has resynced with the ApplicationMasterService
(YARN-1365 and YARN-1366). If the RM had started the application container before dying, then
the AM would be up and trying to contact the RM. However, the RM may have died before launching
the container. In that case, the RM should wait for the AM liveness period and then issue a kill
container for the stored master container. It should then transition this attempt to some RECOVER_ERROR
state and proceed to start a new attempt.
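
The decision described above could be sketched roughly as follows. This is only an illustration of the proposed behavior, not code from the RM: the class AttemptRecoverySketch, the Decision type, the decide method and its parameters are all hypothetical names, and RECOVER_ERROR is the placeholder state name used in this description.

```java
import java.util.Optional;

/** Hypothetical sketch; none of these names come from the YARN code base. */
final class AttemptRecoverySketch {

    /** RECOVER_ERROR is the placeholder state name used in the issue text. */
    enum AttemptState { RUNNING, RECOVER_ERROR }

    /** Outcome of recovering the last stored attempt after an RM restart. */
    static final class Decision {
        final AttemptState state;
        final boolean killMasterContainer;
        final boolean startNewAttempt;

        Decision(AttemptState state, boolean killMasterContainer, boolean startNewAttempt) {
            this.state = state;
            this.killMasterContainer = killMasterContainer;
            this.startNewAttempt = startNewAttempt;
        }
    }

    /**
     * Decides what to do with the recovered last attempt.
     *
     * @param amResynced   true once the AM has resynced with the RM
     * @param elapsedMs    time since the RM restarted (assumed input)
     * @param amLivenessMs the AM liveness period (assumed input)
     * @return a decision, or empty while the RM should keep waiting for the AM
     */
    static Optional<Decision> decide(boolean amResynced, long elapsedMs, long amLivenessMs) {
        if (amResynced) {
            // The AM was launched before the RM died and has reconnected.
            return Optional.of(new Decision(AttemptState.RUNNING, false, false));
        }
        if (elapsedMs < amLivenessMs) {
            // Not heard from yet, but the liveness period has not expired.
            return Optional.empty();
        }
        // The AM never showed up: kill the stored master container and retry.
        return Optional.of(new Decision(AttemptState.RECOVER_ERROR, true, true));
    }
}
```

An empty result means the liveness period has not yet expired, so the RM would keep the attempt as-is and check again later.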



--
This message was sent by Atlassian JIRA
(v6.1#6144)