hadoop-yarn-issues mailing list archives

From "MENG DING (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-1449) Protocol changes in NM side to support change container resource
Date Mon, 08 Jun 2015 16:05:01 GMT

     [ https://issues.apache.org/jira/browse/YARN-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

MENG DING updated YARN-1449:
    Attachment: YARN-1449.1.patch

Attaching patch for review.

The patch has passed the {{test-patch}} script, and includes the following changes:
* Added *ChangeContainersResourceRequest*/*ChangeContainersResourceResponse* protocol
* Added *changeContainersResource* method in *ContainerManagementProtocol*
* Updated *ContainerManagerImpl* to implement the container resource change actions
* Updated unit tests

The patch does *NOT* yet include the changes to *NodeStatus*. I would
like to discuss the changes to the NodeStatusProto further, especially now that
we want to update the node heartbeat response to let the RM confirm the final resource changes
with the NM. [~leftnoteasy], do you think it would be a good idea to reopen YARN-1644 so that
I can start the discussion and post patches for the NodeStatus changes there? If you
think that is not necessary, I will discuss them in this thread.

I was able to reuse a lot of the code from the original patch :-). The major differences are
listed as follows:

* The *ChangeContainersResourceResponse* now returns a map from containerID to exception for failed
requests, instead of a list of failed containerIDs. This is consistent with other APIs.
* In {{ContainerManagerImpl.java}}
** Stricter checking of the resource change request, including checks for token expiration
and the RM identifier.
** Reject resource change requests with both resource increase and decrease specified for
the same container in the same request.
** Check the validity of the target resource. For a decrease request, the target resource must fit
in the current resource; otherwise the request is rejected right away.
** Added a {{CHANGE_CONTAINER}} event so that container resource change and nodemanager metrics
updates will be routed to {{ContainerImpl}}. I believe this is more consistent with the current
event model (e.g., from {{CONTAINER_LAUNCHED}} event to {{START_MONITORING_CONTAINER}}).
** Synchronize the calls to change/stop/getstatus of containers.
* In {{ContainerImpl}}
** The {{Resource}} field must now be updated after each successful resource change. It is
used to detect invalid resource change requests coming from the AM.
** The nodemanager metrics need to be updated as well.
** Fire {{CHANGE_MONITORING_CONTAINER}} event in {{ContainerResourceChangeTransition}}.
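
To make the validation rules above concrete, here is a minimal, self-contained sketch. It is not the code in the patch: {{ResourceChangeSketch}}, {{Res}}, {{Change}} and the string container IDs are hypothetical stand-ins, and token and RM identifier checks are omitted. The rule that an increase target must be at least as large as the current resource is my assumption; only the decrease rule is stated above.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified stand-ins for illustration; not the classes in the patch.
final class ResourceChangeSketch {

    /** Simplified resource: memory in MB plus virtual cores. */
    static final class Res {
        final int memoryMb;
        final int vcores;

        Res(int memoryMb, int vcores) {
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }

        /** True if this resource is no larger than other in every dimension. */
        boolean fitsIn(Res other) {
            return memoryMb <= other.memoryMb && vcores <= other.vcores;
        }
    }

    /** One requested change: a container ID and its target resource. */
    static final class Change {
        final String containerId;
        final Res target;

        Change(String containerId, Res target) {
            this.containerId = containerId;
            this.target = target;
        }
    }

    // Current resource per container; updated after each successful change so
    // that later requests are validated against the new value.
    private final Map<String, Res> current = new HashMap<>();

    synchronized void addContainer(String containerId, Res resource) {
        current.put(containerId, resource);
    }

    synchronized Res currentResource(String containerId) {
        return current.get(containerId);
    }

    /** Applies the valid changes and returns containerId -> failure for the rest. */
    synchronized Map<String, Exception> changeContainersResource(
            List<Change> increases, List<Change> decreases) {
        Map<String, Exception> failed = new LinkedHashMap<>();

        // Containers named in both lists are rejected outright.
        Set<String> increaseIds = new HashSet<>();
        for (Change c : increases) {
            increaseIds.add(c.containerId);
        }
        Set<String> conflicting = new HashSet<>();
        for (Change c : decreases) {
            if (increaseIds.contains(c.containerId)) {
                conflicting.add(c.containerId);
            }
        }

        for (Change c : increases) {
            apply(c, true, conflicting, failed);
        }
        for (Change c : decreases) {
            apply(c, false, conflicting, failed);
        }
        return failed;
    }

    private void apply(Change c, boolean increase, Set<String> conflicting,
                       Map<String, Exception> failed) {
        Res now = current.get(c.containerId);
        if (conflicting.contains(c.containerId)) {
            failed.put(c.containerId, new IllegalArgumentException(
                    "both increase and decrease requested"));
        } else if (now == null) {
            failed.put(c.containerId, new IllegalArgumentException("unknown container"));
        } else if (increase ? !now.fitsIn(c.target) : !c.target.fitsIn(now)) {
            // Assumed rule for increases: the current resource must fit in the target.
            failed.put(c.containerId, new IllegalArgumentException(
                    "target resource is not a valid " + (increase ? "increase" : "decrease")));
        } else {
            current.put(c.containerId, c.target);
        }
    }
}
```

In this sketch a caller looks up a container ID in the returned map to learn why its request was rejected, and successful changes are visible right away through {{currentResource}}.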

Thanks a lot.

> Protocol changes in NM side to support change container resource
> ----------------------------------------------------------------
>                 Key: YARN-1449
>                 URL: https://issues.apache.org/jira/browse/YARN-1449
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>            Reporter: Wangda Tan (No longer used)
>         Attachments: YARN-1449.1.patch, yarn-1449.1.patch, yarn-1449.3.patch, yarn-1449.4.patch,
> As described in YARN-1197, we need to add the following API/implementation changes:
> 1) Add a "changeContainersResources" method in ContainerManagementProtocol
> 2) Return the succeeded/failed increased/decreased containers in the response of "changeContainersResources"
> 3) Add a "new decreased containers" field in NodeStatus which can help NM notify RM such

This message was sent by Atlassian JIRA
