hadoop-yarn-issues mailing list archives

From "Junping Du (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5311) Document graceful decommission CLI and usage
Date Wed, 08 Mar 2017 22:40:38 GMT

    [ https://issues.apache.org/jira/browse/YARN-5311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15902101#comment-15902101 ]

Junping Du commented on YARN-5311:
----------------------------------

Sorry for coming to this late; reviewing documentation is never easy work.
Thanks [~elek] for the patch. Some comments so far:
1. In the overview, we should explain some high-level use cases, such as elasticity for YARN
nodes in public cloud infrastructure. We should also mention timeout tracking on the client
side and on the server side, and how they differ from the perspective of IT operations.

2. As far as I remember, client-side timeout tracking did not initially support a timeout
value specified in the exclude file. It seems YARN-4676 only supports that for server-side
tracking. We should mention that explicitly.

3. Also, for the exclude file, we should mention that we currently support only plain text
(no timeout value) and XML. However, we plan to support a JSON format in the future; please
refer to YARN-5536 for more details.

4. We should mention the behavior when the RM is restarted or fails over: the decommissioning
node will get decommissioned after the RM comes back, as no timeout value is preserved so far.
We should enhance this later, once YARN-5464 is fixed. For now we can just mention the current
behavior as a NOTE and update it once we have a better solution.
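
To make comment 3 concrete, here is a minimal sketch of how the two exclude-file formats
differ: a plain-text file carries hostnames only, while an XML file can attach a timeout to
a host. The function name parse_exclude_file and the XML element names (hosts, host, name,
timeout) are assumptions made for illustration, not taken from the patch; check them against
YARN-4676 before documenting them.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional


def parse_exclude_file(text: str) -> Dict[str, Optional[int]]:
    """Map each excluded host to a timeout in seconds, or None if not given.

    Hypothetical helper for illustration only; it is not the RM's parser.
    """
    hosts: Dict[str, Optional[int]] = {}
    stripped = text.strip()
    if stripped.startswith("<"):
        # XML form. Assumed layout:
        #   <hosts><host><name>h</name><timeout>123</timeout></host></hosts>
        root = ET.fromstring(stripped)
        for host in root.iter("host"):
            name = (host.findtext("name") or "").strip()
            timeout = host.findtext("timeout")
            if name:
                has_timeout = timeout is not None and timeout.strip() != ""
                hosts[name] = int(timeout) if has_timeout else None
    else:
        # Plain-text form: one hostname per line, no timeout value possible.
        for line in stripped.splitlines():
            name = line.strip()
            if name:
                hosts[name] = None
    return hosts
```

With this sketch, a plain-text file can only ever yield None timeouts, which is the
limitation the documentation should state explicitly.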

Some NITs:

bq. (Note: It isn't needed to restart resourcemanager in case of changing the exclude-path
as it's reread at every `refresNodes` command)
It is unnecessary to restart the RM after changing the exclude-path, as this config is
read again on every 'refreshNodes' command.

bq. +* WAIT_CONTAINER --- wait for running containers to complete.
Capitalize the "w" in "wait", as in the other items.

bq. +* WAIT_APP --- wait for running application to complete (after all containers complete)
Same comment as above.

> Document graceful decommission CLI and usage
> --------------------------------------------
>
>                 Key: YARN-5311
>                 URL: https://issues.apache.org/jira/browse/YARN-5311
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: documentation
>    Affects Versions: 2.9.0
>            Reporter: Junping Du
>            Assignee: Elek, Marton
>         Attachments: YARN-5311.001.patch, YARN-5311.002.patch, YARN-5311.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


