hadoop-yarn-issues mailing list archives

From "Junping Du (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-41) The RM should handle the graceful shutdown of the NM.
Date Wed, 13 May 2015 09:54:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541669#comment-14541669 ]

Junping Du commented on YARN-41:

bq. In the case of work-preserving NM restart (or under supervision as YARN-2331 calls it),
we can make the NM not do an unregister?
I think the latest patch (-4.patch) already does this, but my concern is a little broader: do
users (or management tools for YARN clusters, such as Ambari) expect the same behavior for kill
-9 on the NM daemon and for a shutdown of the NM daemon? With the current patch (assuming NM
work-preserving restart is disabled), a user who shuts down the NM daemon will find that the RM
no longer has this NM's info, whereas kill -9 on the NM daemon keeps the old behavior (the RM
still shows the NM in the running state and switches it to LOST after the timeout). Previously,
the two operations behaved the same.
That said, I don't think we care too much about keeping these two operations consistent; I just
want to call it out loudly to make sure we don't miss anything important.
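
To make the difference between the two operations concrete, here is a toy sketch in Python. It is not YARN code: the class, the method names, and the 600-second timeout are all assumptions made up for illustration. It only models the behavior described above, where a graceful shutdown unregisters the NM and the RM drops it right away, while a kill -9'd NM stays in the running state until the expiry timeout and then becomes LOST.

```python
from dataclasses import dataclass, field

# Toy model only: class and method names are hypothetical, not YARN APIs.
EXPIRY_SECONDS = 600.0  # assumed liveness timeout, chosen for illustration


@dataclass
class ToyResourceManager:
    # node id -> [state, time of last heartbeat]
    nodes: dict = field(default_factory=dict)

    def register(self, node_id, now):
        self.nodes[node_id] = ["RUNNING", now]

    def heartbeat(self, node_id, now):
        if node_id in self.nodes:
            self.nodes[node_id][1] = now

    def unregister(self, node_id):
        # Graceful shutdown path: the NM tells the RM it is going away,
        # and the RM drops the node immediately.
        self.nodes.pop(node_id, None)

    def expire_stale_nodes(self, now):
        # kill -9 path: no unregister arrives, so the RM only notices
        # once heartbeats have been missing for longer than the timeout.
        for entry in self.nodes.values():
            if entry[0] == "RUNNING" and now - entry[1] > EXPIRY_SECONDS:
                entry[0] = "LOST"

    def state_of(self, node_id):
        entry = self.nodes.get(node_id)
        return entry[0] if entry else None


def simulate():
    rm = ToyResourceManager()
    rm.register("nm-graceful", now=0.0)
    rm.register("nm-killed", now=0.0)

    rm.unregister("nm-graceful")  # NM daemon shut down cleanly
    # "nm-killed" received kill -9: it simply stops heartbeating.

    rm.expire_stale_nodes(now=10.0)
    shortly_after = (rm.state_of("nm-graceful"), rm.state_of("nm-killed"))

    rm.expire_stale_nodes(now=EXPIRY_SECONDS + 1.0)
    after_timeout = (rm.state_of("nm-graceful"), rm.state_of("nm-killed"))
    return shortly_after, after_timeout
```

In this sketch, a management tool polling the RM shortly after the two events would see no entry at all for the gracefully stopped node but a RUNNING entry for the killed one, which is the inconsistency being called out.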

> The RM should handle the graceful shutdown of the NM.
> -----------------------------------------------------
>                 Key: YARN-41
>                 URL: https://issues.apache.org/jira/browse/YARN-41
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager, resourcemanager
>            Reporter: Ravi Teja Ch N V
>            Assignee: Devaraj K
>              Labels: BB2015-05-TBR
>         Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, YARN-41-4.patch, YARN-41.patch
> Instead of waiting for the NM expiry, RM should remove and handle the NM, which is shutdown

This message was sent by Atlassian JIRA