[ https://issues.apache.org/jira/browse/YARN-1686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rohith updated YARN-1686:
-------------------------
Attachment: YARN-1686.2.patch
Thank you, Vinod, for reviewing the patch.
I have updated the patch to address all your comments. Please review the new patch.
Jian He, thanks for the motivation. :-)
> NodeManager.resyncWithRM() does not handle exceptions, which causes NodeManager to hang.
> ----------------------------------------------------------------------------------------
>
> Key: YARN-1686
> URL: https://issues.apache.org/jira/browse/YARN-1686
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 2.3.0
> Reporter: Rohith
> Assignee: Rohith
> Fix For: 3.0.0
>
> Attachments: YARN-1686.1.patch, YARN-1686.2.patch
>
>
> During NodeManager startup, if registration with the ResourceManager throws an exception, the
> NodeManager shuts down.
> Consider the case where NM-1 is registered with the RM and the RM issues a Resync to the NM. If any
> exception is thrown in "resyncWithRM" (which starts a new thread that does not handle exceptions)
> during the RESYNC event, that thread is lost and the NodeManager hangs.
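
The failure mode described above can be illustrated with a minimal, standalone Java sketch. It is not the attached patch and uses no Hadoop classes; the class name ResyncSketch, the method signature, and the two callbacks are hypothetical stand-ins. It only shows one way a resync thread could catch a failure and hand it to a shutdown path instead of dying silently.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class ResyncSketch {

    /**
     * Runs the re-registration step on a new thread, as the issue describes,
     * but catches any failure and passes control to a shutdown callback.
     * Both callbacks are hypothetical placeholders, not NodeManager APIs.
     */
    static Thread resyncWithRM(Runnable reRegister, Runnable shutdown) {
        Thread t = new Thread(() -> {
            try {
                reRegister.run();
            } catch (Throwable e) {
                // Without this catch the thread would simply end and the
                // caller would never learn that the resync failed.
                System.err.println("Resync failed, requesting shutdown: " + e);
                shutdown.run();
            }
        }, "resync-sketch");
        t.start();
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicBoolean shutdownRequested = new AtomicBoolean(false);

        // Simulate a registration failure during resync.
        Thread t = resyncWithRM(
            () -> { throw new IllegalStateException("registration failed"); },
            () -> shutdownRequested.set(true));
        t.join();

        System.out.println("shutdown requested: " + shutdownRequested.get());
    }
}
```

In this sketch the simulated registration failure ends with the shutdown callback being invoked, so the process has a chance to exit cleanly rather than hang.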
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)