hadoop-yarn-issues mailing list archives

From "MENG DING (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1651) CapacityScheduler side changes to support increase/decrease container resource.
Date Tue, 01 Sep 2015 15:13:46 GMT

    [ https://issues.apache.org/jira/browse/YARN-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14725533#comment-14725533 ]

MENG DING commented on YARN-1651:
---------------------------------

Hi, [~leftnoteasy], thanks so much for posting the patch.  

I do have one question regarding the patch. Recall that during the design discussion we agreed
that as long as an increase has not yet completed for a container, we should not process any
other increase/decrease requests for the same container. It seems that this patch will still
process decrease/increase requests even while an increase action is ongoing?

Consider the following sequence of events:
Example 1:
1. AM sends container increase request to RM
2. RM allocates the resource and gives out increase token to AM
3. AM sends decrease request to RM for the same container
4. AM uses the increase token to increase resource on NM
5. NM reports container status back to RM

IIUC, at step 3, this patch will decrease the container size and remove the container from
the allocation expirer. At step 5, this patch will see that the RM container size is smaller than
the reported NM container size, and will tell NM to decrease the container resource. The concern
I have with this approach is that at step 4, the user will think that the increase has been
successfully applied on NM, but in fact it will not hold, because RM will shortly tell NM to
decrease the container again.
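
To make the sequence in Example 1 concrete, here is a toy Java model of it. This is only a
sketch of my reading of the patch as described above, not code from the patch: the class
DecreaseDuringIncreaseSketch and all of its fields and methods are hypothetical, and sizes
are plain megabyte counts rather than YARN Resource objects.

```java
/** Toy model of Example 1. Not YARN code; every name here is hypothetical. */
class DecreaseDuringIncreaseSketch {
    long rmSizeMb;      // size the RM believes the container has
    long nmSizeMb;      // size actually enforced on the NM
    boolean inExpirer;  // whether the RM is still waiting for the increase to be confirmed

    DecreaseDuringIncreaseSketch(long initialMb) {
        rmSizeMb = initialMb;
        nmSizeMb = initialMb;
        inExpirer = false;
    }

    /** Steps 1-2: RM allocates the resource and hands out a token of the given size. */
    long requestIncrease(long targetMb) {
        rmSizeMb = targetMb;
        inExpirer = true;
        return targetMb;
    }

    /** Step 3: assumed behavior, the decrease is applied at once and the expirer entry is dropped. */
    void requestDecrease(long targetMb) {
        rmSizeMb = targetMb;
        inExpirer = false;
    }

    /** Step 4: the NM accepts the token that the AM still holds. */
    void useTokenOnNm(long tokenMb) {
        nmSizeMb = tokenMb;
    }

    /** Step 5: RM compares sizes; returns true if it told the NM to shrink the container. */
    boolean onNmReport() {
        if (rmSizeMb < nmSizeMb) {
            nmSizeMb = rmSizeMb;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        DecreaseDuringIncreaseSketch c = new DecreaseDuringIncreaseSketch(2048);
        long token = c.requestIncrease(4096);
        c.requestDecrease(1024);
        c.useTokenOnNm(token);
        long sizeSeenByAm = c.nmSizeMb;  // what the AM observes right after step 4
        boolean shrunk = c.onNmReport();
        System.out.println("AM saw " + sizeSeenByAm + " MB, NM ended at "
                + c.nmSizeMb + " MB, shrunk=" + shrunk);
    }
}
```

In this model the AM sees the larger size right after step 4, and the NM-side size drops
below it as soon as the report at step 5 is processed.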

Also, what will happen in the following sequence of events?
Example 2:
1. AM sends container increase request to RM
2. RM allocates the resource and gives out increase token (token1) to AM
3. AM sends a new container increase request for the same container to RM with more resource
4. RM allocates the resource and gives out increase token (token2) to AM
5. AM uses token1 (the one with smaller size) to increase resource on NM, but not token2

IIUC, when RM receives the increase report from NM, it will find that the RM container
size is larger than the reported NM container size, and do nothing about it. Later on, when
token2 expires, the entire container will be killed according to the current implementation.
I think this behavior could be quite confusing to the user.
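
Example 2 can be modeled the same way. Again, this is a rough sketch under my reading of
the patch, with hypothetical names throughout (OverlappingIncreaseSketch and its members).
It assumes a single "awaiting confirmation" flag per container, and it assumes that a
report whose size matches the RM size clears that flag; the patch may track this
differently.

```java
/** Toy model of Example 2. Not YARN code; every name here is hypothetical. */
class OverlappingIncreaseSketch {
    long rmSizeMb;                 // size the RM believes the container has
    long nmSizeMb;                 // size actually enforced on the NM
    boolean awaitingConfirmation;  // RM is waiting for the NM to report the increased size
    boolean killed;                // container was killed after a token expired

    OverlappingIncreaseSketch(long initialMb) {
        rmSizeMb = initialMb;
        nmSizeMb = initialMb;
        awaitingConfirmation = false;
        killed = false;
    }

    /** Steps 1-2 and 3-4: every increase is granted immediately and returns a token size. */
    long requestIncrease(long targetMb) {
        rmSizeMb = targetMb;
        awaitingConfirmation = true;
        return targetMb;
    }

    /** Step 5: the AM applies one of its tokens on the NM. */
    void useTokenOnNm(long tokenMb) {
        nmSizeMb = tokenMb;
    }

    /** NM report: only a matching size confirms the increase (assumption of this sketch). */
    void onNmReport() {
        if (rmSizeMb == nmSizeMb) {
            awaitingConfirmation = false;
        }
        // rmSizeMb > nmSizeMb: nothing is done, as described above.
    }

    /** Expiry of the latest token while still unconfirmed kills the whole container. */
    void onTokenExpired() {
        if (awaitingConfirmation) {
            killed = true;
        }
    }
}
```

With token1 applied and token2 left unused, the report never matches the RM size, so the
expiry of token2 kills the container even though the AM completed a valid increase.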

IMHO, at least for the case in example 2, we should delay processing of the second increase
request until the first increase action is completed.
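
One possible shape for that delay, as a minimal sketch only: keep a per-container flag for
an in-flight increase and queue any further increase request until the first one is
confirmed by NM. IncreaseGateSketch and its members are hypothetical names for
illustration, not a proposal for the actual scheduler classes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy gate that defers a second increase until the first is confirmed. Not YARN code. */
class IncreaseGateSketch {
    long rmSizeMb;                                    // size the RM believes the container has
    boolean increaseInProgress = false;               // a token is out and not yet confirmed
    final Deque<Long> deferred = new ArrayDeque<>();  // increase targets waiting their turn

    IncreaseGateSketch(long initialMb) {
        rmSizeMb = initialMb;
    }

    /** Returns true if a token is issued now, false if the request was deferred. */
    boolean requestIncrease(long targetMb) {
        if (increaseInProgress) {
            deferred.addLast(targetMb);
            return false;
        }
        rmSizeMb = targetMb;
        increaseInProgress = true;
        return true;
    }

    /** Called when NM reports the increased size; returns true if a deferred request started. */
    boolean onIncreaseConfirmed() {
        increaseInProgress = false;
        Long next = deferred.pollFirst();
        return next != null && requestIncrease(next);
    }
}
```

Under this gate there is never more than one outstanding increase token per container, so
the AM cannot pick the "wrong" token as in example 2.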

> CapacityScheduler side changes to support increase/decrease container resource.
> -------------------------------------------------------------------------------
>
>                 Key: YARN-1651
>                 URL: https://issues.apache.org/jira/browse/YARN-1651
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager, scheduler
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>         Attachments: YARN-1651-1.YARN-1197.patch, YARN-1651-WIP.YARN-1197.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
