hadoop-yarn-issues mailing list archives

From "Subru Krishnan (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-1815) Work preserving recovery of Unmanaged AMs
Date Tue, 31 May 2016 22:47:13 GMT

     [ https://issues.apache.org/jira/browse/YARN-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Subru Krishnan updated YARN-1815:
    Attachment: YARN-1815-v4.patch

Please find attached the updated patch (v4), which fixes *TestRMAppAttemptTransitions*. All other
test failures are unrelated (tracked in HADOOP-12687 and YARN-5091); I verified that they pass locally.

> Work preserving recovery of Unmanaged AMs
> -----------------------------------------
>                 Key: YARN-1815
>                 URL: https://issues.apache.org/jira/browse/YARN-1815
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 2.3.0
>            Reporter: Karthik Kambatla
>            Assignee: Subru Krishnan
>            Priority: Critical
>         Attachments: Unmanaged AM recovery.png, YARN-1815-v3.patch, YARN-1815-v4.patch, yarn-1815-1.patch, yarn-1815-2.patch, yarn-1815-2.patch
>
> Currently, work-preserving RM restart recovers unmanaged AMs, but it has two shortcomings:
> all running containers are killed, and completed unmanaged AMs are also recovered because we do
> _not_ record the final state for unmanaged AMs in the RM StateStore. This JIRA proposes to address
> both shortcomings so that work-preserving unmanaged AM recovery works exactly as it does for
> managed AMs.

