hadoop-yarn-issues mailing list archives

From "MENG DING (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1197) Support changing resources of an allocated container
Date Mon, 15 Jun 2015 22:03:07 GMT

    [ https://issues.apache.org/jira/browse/YARN-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586999#comment-14586999 ]

MENG DING commented on YARN-1197:

[~sandyr], yes. The key assumption is that by the time the Application Master asks the RM to
decrease the resources of a particular container, that container has already reduced its
resource usage. Therefore, the RM can immediately allocate the freed resources to others.
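
To make the assumption concrete, here is a minimal sketch of the RM-side bookkeeping for a decrease. It is a toy model, not YARN code: `ToyNode`, its methods, and the memory-only accounting in MB are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNode:
    """Hypothetical stand-in for the RM's view of one node (memory only, in MB)."""
    capacity_mb: int
    allocated_mb: dict = field(default_factory=dict)  # container id -> MB

    def available_mb(self) -> int:
        return self.capacity_mb - sum(self.allocated_mb.values())

    def allocate(self, container_id: str, mb: int) -> bool:
        # Refuse the allocation if the node does not have enough free memory.
        if mb > self.available_mb():
            return False
        self.allocated_mb[container_id] = mb
        return True

    def decrease(self, container_id: str, new_mb: int) -> None:
        # The RM trusts that the container has already shrunk its usage,
        # so the bookkeeping is updated at once with no NM round trip.
        current = self.allocated_mb[container_id]
        if not 0 < new_mb < current:
            raise ValueError("a decrease must shrink the allocation to a positive size")
        self.allocated_mb[container_id] = new_mb


node = ToyNode(capacity_mb=8192)
node.allocate("c1", 6144)
before = node.allocate("c2", 4096)  # rejected: only 2048 MB is free
node.decrease("c1", 2048)           # AM asks the RM to shrink c1
after = node.allocate("c2", 4096)   # accepted: the freed memory is usable immediately
```

In this sketch the second attempt to place `c2` succeeds as soon as `decrease` returns, which is the behaviour the assumption permits.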

So to summarize the main idea:
* Both container resource increase and decrease requests go through RM. This eliminates the
race condition in which a decrease for a container takes place while an increase for the same
container is still in progress.
* There is no longer any need for an AM-NM protocol. This greatly simplifies the logic for application
* Resource decrease can happen immediately in RM, and the actual enforcement and monitoring of
the decrease can happen offline, as mentioned by Vinod.
* Resource increase, on the other hand, needs more thought.
** In the current design, the RM gives out an increase token that the AM uses to initiate
the increase on NM. There is no need for this: RM can notify NM of the increase through the
RM-NM heartbeat response.
** RM still needs to wait for an acknowledgement from NM confirming that the increase is done
before sending its response to AM. This takes two heartbeat cycles, but that is not much
worse than giving a token to AM first and then letting AM initiate the increase.
** Since RM needs to wait for an acknowledgement from NM to confirm the increase, we must handle
cases such as timeout, NM restart/recovery, etc. So we probably still need a container
increase token and token expiration logic for this purpose, but the token will be sent to
NM through the RM-NM heartbeat protocol. (I am still working out the details.)
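
The increase path described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `ToyRM`, `PendingIncrease`, and `nm_heartbeat` are hypothetical names, not YARN APIs. A single NM is modelled, time is passed in explicitly, and the `expires_at` field stands in for the token expiration logic whose details are still open.

```python
from dataclasses import dataclass


@dataclass
class PendingIncrease:
    container_id: str
    target_mb: int
    expires_at: float        # stand-in for the increase token's expiry
    sent_to_nm: bool = False


class ToyRM:
    """Hypothetical single-node model of the RM side of a container increase."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self.pending = {}     # container id -> PendingIncrease
        self.confirmed = {}   # container id -> MB, ready to report to the AM

    def request_increase(self, container_id, target_mb, now):
        # Called when the AM asks for more resources; nothing is confirmed yet.
        self.pending[container_id] = PendingIncrease(
            container_id, target_mb, now + self.timeout_s)

    def nm_heartbeat(self, acked_ids, now):
        """Handle NM acks, drop expired requests, return increases to pass to NM."""
        for cid in acked_ids:
            inc = self.pending.get(cid)
            if inc is not None and inc.sent_to_nm and now <= inc.expires_at:
                self.confirmed[cid] = inc.target_mb
                del self.pending[cid]
        # Requests that were never acknowledged in time are abandoned.
        expired = [cid for cid, inc in self.pending.items() if now > inc.expires_at]
        for cid in expired:
            del self.pending[cid]
        # Anything not yet delivered rides on this heartbeat response.
        to_send = [inc for inc in self.pending.values() if not inc.sent_to_nm]
        for inc in to_send:
            inc.sent_to_nm = True
        return [(inc.container_id, inc.target_mb) for inc in to_send]


rm = ToyRM(timeout_s=10.0)
rm.request_increase("c1", 4096, now=0.0)
first = rm.nm_heartbeat(acked_ids=[], now=1.0)   # heartbeat 1: RM notifies NM
rm.nm_heartbeat(acked_ids=["c1"], now=2.0)       # heartbeat 2: NM acknowledges
confirmed = dict(rm.confirmed)

rm.request_increase("c2", 2048, now=2.0)
rm.nm_heartbeat(acked_ids=[], now=3.0)           # delivered to NM
rm.nm_heartbeat(acked_ids=[], now=20.0)          # no ack before expiry: dropped
```

The first request is confirmed only after the second heartbeat, matching the two-cycle cost noted above, while the second request is discarded once its expiry passes without an acknowledgement. NM restart/recovery is not modelled here.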

> Support changing resources of an allocated container
> ----------------------------------------------------
>                 Key: YARN-1197
>                 URL: https://issues.apache.org/jira/browse/YARN-1197
>             Project: Hadoop YARN
>          Issue Type: Task
>          Components: api, nodemanager, resourcemanager
>    Affects Versions: 2.1.0-beta
>            Reporter: Wangda Tan
>         Attachments: YARN-1197 old-design-docs-patches-for-reference.zip, YARN-1197_Design.pdf
> The current YARN resource management logic assumes that the resources allocated to a container
> are fixed for its lifetime. When users want to change the resources of an allocated container,
> the only way is to release it and allocate a new container of the expected size.
> Allowing the resources of an allocated container to change at run time will give us better
> control of resource usage on the application side.

This message was sent by Atlassian JIRA
