hadoop-yarn-issues mailing list archives

From "MENG DING (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1509) Make AMRMClient support send increase container request and get increased/decreased containers
Date Thu, 08 Oct 2015 14:12:26 GMT

    [ https://issues.apache.org/jira/browse/YARN-1509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14948709#comment-14948709 ]

MENG DING commented on YARN-1509:
---------------------------------

Hi, [~bikassaha]

Thanks a lot for the valuable comments!

bq. Why are there separate methods for increase and decrease instead of a single method to
change the container resource size? By comparing the existing resource allocation to a container
and the new requested resource allocation, it should be clear whether an increase or decrease
is being requested.

As discussed in the design stage, and also described in the design doc, the reason to separate
the increase/decrease requests in the APIs and AMRM protocol is to make sure that users will
make a conscious decision when they are making these requests. It is also much easier to catch
any potential mistakes that the user could make. For example, if a user intends to increase
the resources of a container, but for whatever reason mistakenly specifies a target resource that
is smaller than the current resource, the RM can catch that and throw an exception.
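
A minimal sketch of the kind of check that separate increase/decrease entry points make possible. SketchResource and ChangeRequestValidator are hypothetical stand-ins, not the classes in the patch, and treating a target equal to the current size as valid in both directions is an assumption made for illustration:

```java
// Hypothetical stand-in for a container resource; not the real YARN Resource class.
final class SketchResource {
    final int memoryMb;
    final int vcores;

    SketchResource(int memoryMb, int vcores) {
        this.memoryMb = memoryMb;
        this.vcores = vcores;
    }

    // True when every dimension of this resource is <= the same dimension of other.
    boolean fitsIn(SketchResource other) {
        return memoryMb <= other.memoryMb && vcores <= other.vcores;
    }
}

// Hypothetical validator: one entry point per direction, so the caller states intent.
final class ChangeRequestValidator {
    // Rejects an "increase" whose target is smaller than the current size in any dimension.
    static void checkIncrease(SketchResource current, SketchResource target) {
        if (!current.fitsIn(target)) {
            throw new IllegalArgumentException(
                "increase target is smaller than the current allocation");
        }
    }

    // Rejects a "decrease" whose target is larger than the current size in any dimension.
    static void checkDecrease(SketchResource current, SketchResource target) {
        if (!target.fitsIn(current)) {
            throw new IllegalArgumentException(
                "decrease target is larger than the current allocation");
        }
    }
}
```

With a single change method, a target of 1024 MB against a 2048 MB container would silently be treated as a decrease; with checkIncrease the same call is rejected.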

bq. Also, for completeness, is there a need for a cancelContainerResourceChange()? After a
container resource change request has been submitted, what are my options as a user other
than to wait for the request to be satisfied by the RM?

For a container resource decrease request, there is practically no chance (and probably no need)
to cancel the request, as it takes effect immediately when the scheduler processes the request (this
is similar to the release container request). For a container resource increase, the user can
cancel any pending increase request still sitting in the RM by sending a decrease request whose
target size equals the current container size. I will improve the Javadoc description to make
this clear.
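
A rough sketch of the cancellation behaviour described above, tracking memory only for brevity. PendingIncreaseTracker and all of its methods are hypothetical names, not the scheduler code in the patch:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical RM-side bookkeeping; illustrative only.
final class PendingIncreaseTracker {
    private final Map<String, Integer> allocatedMb = new HashMap<>();
    private final Map<String, Integer> pendingIncreaseMb = new HashMap<>();

    void allocate(String containerId, int memoryMb) {
        allocatedMb.put(containerId, memoryMb);
    }

    // An increase waits in the pending map until the scheduler can satisfy it.
    void requestIncrease(String containerId, int targetMb) {
        if (targetMb < currentMb(containerId)) {
            throw new IllegalArgumentException("increase target below current size");
        }
        pendingIncreaseMb.put(containerId, targetMb);
    }

    // A decrease is applied immediately and drops any queued increase,
    // so a decrease to the current size acts as a cancel.
    void requestDecrease(String containerId, int targetMb) {
        if (targetMb > currentMb(containerId)) {
            throw new IllegalArgumentException("decrease target above current size");
        }
        pendingIncreaseMb.remove(containerId);
        allocatedMb.put(containerId, targetMb);
    }

    boolean hasPendingIncrease(String containerId) {
        return pendingIncreaseMb.containsKey(containerId);
    }

    int currentMb(String containerId) {
        Integer mb = allocatedMb.get(containerId);
        if (mb == null) {
            throw new IllegalStateException("unknown container " + containerId);
        }
        return mb;
    }
}
```

Here, requestIncrease("c1", 4096) followed by requestDecrease("c1", 2048) on a 2048 MB container leaves the container at 2048 MB with no pending increase.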

bq. If I release the container, then does it mean all pending change requests for that container
should be removed? From a quick look at the patch, it does not look like that is being covered,
unless I am missing something.

You are right that releasing a container should cancel all pending change requests for that
container. This is missing in the current implementation; I will add it.
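
One way the missing cleanup could look on the client side, as a sketch only. ReleaseAwareChangeBook and its fields are hypothetical and do not reflect the structure of the patch:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical client-side bookkeeping; illustrative only.
final class ReleaseAwareChangeBook {
    private final Map<String, Integer> pendingIncreaseMb = new HashMap<>();
    private final Map<String, Integer> pendingDecreaseMb = new HashMap<>();
    private final Set<String> pendingRelease = new HashSet<>();

    void requestIncrease(String containerId, int targetMb) {
        pendingIncreaseMb.put(containerId, targetMb);
    }

    void requestDecrease(String containerId, int targetMb) {
        pendingDecreaseMb.put(containerId, targetMb);
    }

    // Releasing a container makes any outstanding resize for it meaningless,
    // so both pending maps are purged before the release is queued.
    void release(String containerId) {
        pendingIncreaseMb.remove(containerId);
        pendingDecreaseMb.remove(containerId);
        pendingRelease.add(containerId);
    }

    boolean hasPendingChange(String containerId) {
        return pendingIncreaseMb.containsKey(containerId)
            || pendingDecreaseMb.containsKey(containerId);
    }

    boolean isPendingRelease(String containerId) {
        return pendingRelease.contains(containerId);
    }
}
```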

bq. What will happen if the AM restarts after submitting a change request? Does the AM-RM
re-register protocol need an update to handle the case of re-synchronizing on the change requests?
What happens if the RM restarts? If these are explained in a document, then please point
me to the document. The patch did not seem to have anything around this area. So I thought
I would ask.

The current implementation handles RM restarts by maintaining pendingIncrease and pendingDecrease
maps, just like the pendingRelease list. This is covered in the design doc.
For AM restarts, I am not sure what we need to do here. Does the AM-RM re-register protocol currently
handle the re-synchronization of outstanding new container requests after the AM is restarted? Would
you be able to elaborate a little bit on this?
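
For the RM-restart case, the pending maps can be replayed roughly as follows. This is a sketch under assumptions: ResyncSketchClient, onRmResync and drainOutgoing are hypothetical names, and the real client's heartbeat and resync handling are more involved:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical client that remembers unconfirmed change requests so they
// can be re-sent if the RM restarts and loses them. Memory only, for brevity.
final class ResyncSketchClient {
    private final Map<String, Integer> pendingIncrease = new LinkedHashMap<>();
    private final Map<String, Integer> pendingDecrease = new LinkedHashMap<>();
    // Requests to be carried by the next heartbeat to the RM.
    private final Map<String, Integer> outgoing = new LinkedHashMap<>();

    void requestIncrease(String containerId, int targetMb) {
        pendingDecrease.remove(containerId); // newest request for a container wins
        pendingIncrease.put(containerId, targetMb);
        outgoing.put(containerId, targetMb);
    }

    void requestDecrease(String containerId, int targetMb) {
        pendingIncrease.remove(containerId);
        pendingDecrease.put(containerId, targetMb);
        outgoing.put(containerId, targetMb);
    }

    // Called when the RM reports the change as done; nothing left to replay.
    void onChangeConfirmed(String containerId) {
        pendingIncrease.remove(containerId);
        pendingDecrease.remove(containerId);
    }

    // Called when the RM signals that it restarted: queue every
    // unconfirmed request again for the next heartbeat.
    void onRmResync() {
        outgoing.putAll(pendingIncrease);
        outgoing.putAll(pendingDecrease);
    }

    // Returns the requests for the next heartbeat and clears the queue.
    Map<String, Integer> drainOutgoing() {
        Map<String, Integer> copy = new LinkedHashMap<>(outgoing);
        outgoing.clear();
        return copy;
    }
}
```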

bq. Also, why have the callback interface methods been made non-public? Would that be an incompatible
change?

Methods declared in an interface without a body are implicitly public and abstract. The existing
public modifiers on these methods are redundant, so I removed them.
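
The point can be checked with reflection. In the sketch below, DemoCallbackHandler and its method names are made up for illustration and are not the callback interface from the patch:

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical callback interface; one method omits the modifier, one spells it out.
interface DemoCallbackHandler {
    void onContainersResourceChanged(int count);   // no explicit modifier
    public void onShutdownRequest();               // explicit, redundant modifier
}

final class InterfaceModifierCheck {
    // Reports whether the named interface method is both public and abstract.
    static boolean isPublicAbstract(String name, Class<?>... params) {
        try {
            Method m = DemoCallbackHandler.class.getDeclaredMethod(name, params);
            int mods = m.getModifiers();
            return Modifier.isPublic(mods) && Modifier.isAbstract(mods);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("no such method: " + name, e);
        }
    }
}
```

Both methods report the same modifiers, so dropping the keyword changes neither the compiled interface nor its compatibility for implementers.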

> Make AMRMClient support send increase container request and get increased/decreased containers
> ----------------------------------------------------------------------------------------------
>
>                 Key: YARN-1509
>                 URL: https://issues.apache.org/jira/browse/YARN-1509
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Wangda Tan (No longer used)
>            Assignee: MENG DING
>         Attachments: YARN-1509.1.patch, YARN-1509.2.patch, YARN-1509.3.patch, YARN-1509.4.patch, YARN-1509.5.patch
>
>
> As described in YARN-1197, we need to add APIs to AMRMClient to support:
> 1) Adding an increase request
> 2) Getting successfully increased/decreased containers from the RM



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
