hadoop-yarn-issues mailing list archives

From "Rohith (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-1686) NodeManager.resyncWithRM() does not handle exception which cause NodeManger to Hang.
Date Mon, 24 Feb 2014 10:56:20 GMT

     [ https://issues.apache.org/jira/browse/YARN-1686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohith updated YARN-1686:
-------------------------

    Attachment: YARN-1686.2.patch

Thank you, Vinod, for reviewing the patch.

I have updated the patch to address all your comments. Please review the new patch.

Jian He, thanks for the motivation. :-)

> NodeManager.resyncWithRM() does not handle exception which cause NodeManger to Hang.
> ------------------------------------------------------------------------------------
>
>                 Key: YARN-1686
>                 URL: https://issues.apache.org/jira/browse/YARN-1686
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.3.0
>            Reporter: Rohith
>            Assignee: Rohith
>             Fix For: 3.0.0
>
>         Attachments: YARN-1686.1.patch, YARN-1686.2.patch
>
>
> During NodeManager startup, if registration with the ResourceManager throws an exception,
> the NodeManager shuts down.
> Consider the case where NM-1 is registered with the RM and the RM issues a Resync to the NM.
> If any exception is thrown in "resyncWithRM" (which starts a new thread that does not handle
> exceptions) during the RESYNC event, that thread is lost and the NodeManager hangs.
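
The failure mode described above can be illustrated with a minimal, self-contained Java sketch. This is not the Hadoop code or the attached patch: the class ResyncSketch, the Registration interface, and the shutdown callback are hypothetical names chosen for illustration. The sketch assumes that one reasonable handling is to catch the exception inside the resync thread and request a shutdown, so that a failed re-registration is not silently lost.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class ResyncSketch {

    /** Hypothetical stand-in for the re-registration step; it may throw. */
    interface Registration {
        void registerWithRM() throws Exception;
    }

    /**
     * Runs the resync on a new thread, as the issue describes, but catches
     * any exception and passes control to a shutdown callback (an assumption
     * of this sketch, not necessarily what the patch does).
     */
    static Thread resyncWithRM(Registration registration, Runnable shutdown) {
        Thread t = new Thread(() -> {
            try {
                registration.registerWithRM();
            } catch (Exception e) {
                // Without this catch the thread would terminate on the
                // exception and nothing else would learn that resync failed.
                shutdown.run();
            }
        }, "resync-sketch");
        t.start();
        return t;
    }

    /** Returns true if a failing registration triggered the shutdown callback. */
    static boolean failingResyncTriggersShutdown() {
        AtomicBoolean shutdownRequested = new AtomicBoolean(false);
        Thread t = resyncWithRM(
            () -> { throw new IllegalStateException("simulated registration failure"); },
            () -> shutdownRequested.set(true));
        try {
            t.join(); // wait for the resync thread to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return shutdownRequested.get();
    }

    public static void main(String[] args) {
        System.out.println("shutdown requested: " + failingResyncTriggersShutdown());
    }
}
```

If the try/catch in the thread body is removed, the simulated exception ends the thread and the callback never runs, which mirrors the "thread is lost" behavior reported in the description.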



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)