hadoop-yarn-issues mailing list archives

From "Rohith Sharma K S (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-4685) AM blacklisting result in application to get hanged
Date Fri, 17 Jun 2016 05:14:05 GMT

     [ https://issues.apache.org/jira/browse/YARN-4685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Rohith Sharma K S updated YARN-4685:
    Attachment: YARN-4685-workaround.patch

> AM blacklisting result in application to get hanged
> ---------------------------------------------------
>                 Key: YARN-4685
>                 URL: https://issues.apache.org/jira/browse/YARN-4685
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.8.0
>            Reporter: Rohith Sharma K S
>            Assignee: Rohith Sharma K S
>            Priority: Critical
>         Attachments: YARN-4685-workaround.patch
> AM blacklist additions and removals are updated only when the RMAppAttempt is scheduled, i.e. in {{RMAppAttemptImpl#ScheduleTransition#transition}}.
But once the attempt is scheduled, any removeNode/addNode event in the cluster is not propagated
to {{BlackListManager#refreshNodeHostCount}}. As a result, BlackListManager operates on a stale
NM count, and the application stays in the ACCEPTED state and waits forever, even after the
blacklisted nodes reconnect with their disk space cleared.
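
To illustrate how a stale node count might keep a blacklist in force, here is a minimal toy model. It is not the actual YARN code: the class ToyBlacklistManager, its fields, and the rule "enforce the blacklist only while it covers less than a threshold fraction of the known hosts" are assumptions made for this sketch. Only the method name refreshNodeHostCount is taken from the description above.

```java
import java.util.HashSet;
import java.util.Set;

class StaleHostCountSketch {

    /** Hypothetical stand-in for a blacklist manager; not the real YARN class. */
    static class ToyBlacklistManager {
        private final Set<String> blacklistedHosts = new HashSet<>();
        // Assumed rule: blacklisting is switched off once this fraction of hosts is blacklisted.
        private final double disableThreshold;
        // Host count as last reported to the manager; may lag behind the real cluster.
        private int nodeHostCount;

        ToyBlacklistManager(int nodeHostCount, double disableThreshold) {
            this.nodeHostCount = nodeHostCount;
            this.disableThreshold = disableThreshold;
        }

        void addNode(String host) {
            blacklistedHosts.add(host);
        }

        void refreshNodeHostCount(int count) {
            this.nodeHostCount = count;
        }

        /** The blacklist applies only while it stays below the threshold share of known hosts. */
        boolean isBlacklistEnforced() {
            return blacklistedHosts.size() < disableThreshold * nodeHostCount;
        }
    }

    public static void main(String[] args) {
        // Count captured once, when the attempt was scheduled on a 10-host cluster.
        ToyBlacklistManager mgr = new ToyBlacklistManager(10, 0.8);
        mgr.addNode("host-1");

        // Suppose 9 hosts are later removed and only host-1 remains.
        // Without a refresh the manager still assumes 10 hosts.
        System.out.println("stale count, blacklist enforced: " + mgr.isBlacklistEnforced());

        // Propagating the node event updates the count and lifts the blacklist.
        mgr.refreshNodeHostCount(1);
        System.out.println("fresh count, blacklist enforced: " + mgr.isBlacklistEnforced());
    }
}
```

In this sketch the only remaining host stays blacklisted until the count is refreshed, which would leave an AM request with nowhere to run. Whether the real BlackListManager fails in exactly this way is not stated in the report.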
