hadoop-yarn-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1644) RM-NM protocol changes and NodeStatusUpdater implementation to support container resizing
Date Mon, 17 Aug 2015 23:26:46 GMT

    [ https://issues.apache.org/jira/browse/YARN-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700423#comment-14700423 ]

Wangda Tan commented on YARN-1644:

Discussed with [~jianhe]; some thoughts:

There are 3 corner cases we need to handle:
1. The AM sends a decrease-container request to the RM before sending the increase-container request to the NM.
2. The RM crashes after issuing the container increase, and the AM increases the container on the NM while the NM is registering.
3. Same as 2, but the AM sends a decrease-container request to the RM before the RM receives the NM-reported
increased container.

What we may need to consider is a "version of container": the RM will add 1 to the container version
whenever it increases or decreases a container. The container version will be added to ContainerTokenIdentifier,
to the increased containers reported by the NM, and to the NMContainerStatus sent while registering.

From the RM's view, it should keep the most recently updated container resource. So for the corner cases above:
1. Result: container decreased
2. Result: container increased
3. Result: container decreased (because the latest resource the AM sent to the RM is a decrease).

So on the RM side, it will check:
if (rm.version >= nm.version) {
	// keep the existing container in the RM unchanged, and tell the NM about this
	// "==" is included because rm.version == nm.version means corner case #3 happened
} else {
	// change the container in the RM
}

So in summary, what we need in the protocol is:
- Container-version in ContainerTokenIdentifier
- Container-version in NMContainerStatus
- Add an IncreasedContainer to the NM-RM heartbeat, and include container-version in IncreasedContainer.
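
To make the version check above concrete, here is a minimal Java sketch. It is not taken from any patch on this issue: the class ContainerVersionSketch, the RmContainer record, the Action enum, and the memory-only resource are all hypothetical names chosen for illustration, and the version numbers in main are one possible reading of corner cases #2 and #3.

```java
public class ContainerVersionSketch {

    /** Outcome of comparing the RM's view with what the NM reports (hypothetical). */
    enum Action { KEEP_RM_STATE_AND_NOTIFY_NM, ADOPT_NM_STATE }

    /** Hypothetical RM-side record: the latest known resource and its version. */
    static final class RmContainer {
        int version;
        int memoryMb;

        RmContainer(int version, int memoryMb) {
            this.version = version;
            this.memoryMb = memoryMb;
        }

        /** The RM adds 1 to the version on every increase or decrease it handles. */
        void resize(int newMemoryMb) {
            this.memoryMb = newMemoryMb;
            this.version++;
        }
    }

    /**
     * Applies the check from the comment: if rm.version >= nm.version the RM
     * keeps its own state (the NM must be told); otherwise the RM adopts the
     * resource and version reported by the NM.
     */
    static Action reconcile(RmContainer rm, int nmVersion, int nmMemoryMb) {
        if (rm.version >= nmVersion) {
            return Action.KEEP_RM_STATE_AND_NOTIFY_NM;
        }
        rm.version = nmVersion;
        rm.memoryMb = nmMemoryMb;
        return Action.ADOPT_NM_STATE;
    }

    public static void main(String[] args) {
        // Assumed reading of corner case #2: the restarted RM knows version 0,
        // the NM reports an increase carrying version 1.
        RmContainer case2 = new RmContainer(0, 1024);
        System.out.println(reconcile(case2, 1, 2048) + " -> " + case2.memoryMb + " MB");

        // Assumed reading of corner case #3: the AM's decrease reaches the RM
        // first (version 0 -> 1), then the NM reports the increase at version 1.
        RmContainer case3 = new RmContainer(0, 1024);
        case3.resize(512);
        System.out.println(reconcile(case3, 1, 2048) + " -> " + case3.memoryMb + " MB");
    }
}
```

In this sketch the ">=" comparison is what lets the decrease win in corner case #3: both sides carry the same version, so the RM keeps its own (decreased) resource and would need to inform the NM.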

Thoughts? [~mding]

> RM-NM protocol changes and NodeStatusUpdater implementation to support container resizing
> -----------------------------------------------------------------------------------------
>                 Key: YARN-1644
>                 URL: https://issues.apache.org/jira/browse/YARN-1644
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>            Reporter: Wangda Tan
>            Assignee: MENG DING
>         Attachments: YARN-1644-YARN-1197.4.patch, YARN-1644-YARN-1197.5.patch, YARN-1644.1.patch,
YARN-1644.2.patch, YARN-1644.3.patch, yarn-1644.1.patch

This message was sent by Atlassian JIRA
