hadoop-yarn-issues mailing list archives

From "Jian He (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1686) NodeManager.resyncWithRM() does not handle exception which cause NodeManger to Hang.
Date Sun, 23 Feb 2014 04:17:21 GMT

    [ https://issues.apache.org/jira/browse/YARN-1686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13909667#comment-13909667 ]

Jian He commented on YARN-1686:
-------------------------------

Good catch! Thanks for working on this.

> NodeManager.resyncWithRM() does not handle exception which cause NodeManger to Hang.
> ------------------------------------------------------------------------------------
>
>                 Key: YARN-1686
>                 URL: https://issues.apache.org/jira/browse/YARN-1686
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.3.0
>            Reporter: Rohith
>            Assignee: Rohith
>             Fix For: 3.0.0
>
>         Attachments: YARN-1686.1.patch
>
>
> During NodeManager startup, if registration with the ResourceManager throws an exception,
> the NodeManager shuts down.
> Consider the case where NM-1 is registered with the RM and the RM issues a Resync to the NM.
> If any exception is thrown in "resyncWithRM" (which starts a new thread that does not handle
> exceptions) during the RESYNC event, that thread is lost and the NodeManager hangs.
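
The failure mode described in the issue can be sketched in isolation as follows. This is a minimal illustration, not the NodeManager code and not the attached patch: the class `ResyncSketch`, the methods `registerWithRM()` and `resync()`, and the `shutdownRequested` flag are all hypothetical names. It assumes that one reasonable way to handle the failure is to catch it inside the new thread and request a shutdown, mirroring the startup behaviour described above.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class ResyncSketch {

    /** Hypothetical stand-in for re-registration with the RM; always fails here. */
    static void registerWithRM() {
        throw new IllegalStateException("simulated registration failure");
    }

    /**
     * Runs the resync work on a new thread, as the issue describes.
     * Returns true only if the failure was caught and a shutdown was requested.
     */
    static boolean resync(boolean handleFailure) {
        AtomicBoolean shutdownRequested = new AtomicBoolean(false);

        Thread worker = new Thread(() -> {
            if (!handleFailure) {
                // Unhandled: the exception escapes run(), the thread dies,
                // and nothing else in the process is told about it.
                registerWithRM();
                return;
            }
            try {
                registerWithRM();
            } catch (RuntimeException e) {
                // Handled: record the failure so the owner can shut down
                // instead of waiting forever on a thread that is gone.
                shutdownRequested.set(true);
            }
        }, "resync-sketch");

        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return shutdownRequested.get();
    }

    public static void main(String[] args) {
        System.out.println("unhandled: shutdown requested = " + resync(false));
        System.out.println("handled:   shutdown requested = " + resync(true));
    }
}
```

In the unhandled case the JVM's default handler prints a stack trace to stderr and the worker thread simply ends, so the caller sees no shutdown request. In the handled case the flag is set and the caller can act on it.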



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
