hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Bikas Saha (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1509) Make AMRMClient support send increase container request and get increased/decreased containers
Date Thu, 15 Oct 2015 23:52:05 GMT

    [ https://issues.apache.org/jira/browse/YARN-1509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14959877#comment-14959877

Bikas Saha commented on YARN-1509:

bq. Increase/Decrease/Change
Not sure why the implementation went down that route of creating a separation of increase
vs decrease throughout the flow. In any case, that is the back-end implementation. That can
change and evolve without affecting user code. This is the user facing API. So once code is
written against this API making backwards incompatible changes or providing new functionality
in the future needs to be considered. Also just API simplicity needs to be considered.
1) Having a changeRequest API allows us to support other kinds of changes (mix of increase
or decrease) at a later point. For now, the API could check for increase/decrease (which it
has to do anyways for sanity checking) and reject unsupported scenarios.
2) Even for just increase or decrease, given the user can provide the old Container + new
Request Size, it should be easy for the library to figure out if an increase or decrease is
needed. Why burden the user by having them call 2 different API's that are essentially making
the same request.
Having to handle container token for increase but do nothing for decrease is something that
already has to be explained to the user. Same for the fact that decrease is quick but not
increase. So those aspects of user education are probably orthogonal.

We have iterated on this API question a few times now. In the above 2 points I have tried
to summarize the reasons for requesting to change instead of increase/decrease. Even if we
dont support both increase and decrease (point 1) I think the simplicity of having a single
API (point 2) would be an advantage of having change vs increase/decrease. This also simplifies
to having a single callback onContainerResourceChanged() instead of 2 callbacks.

At this point, I will request you and [~leftnoteasy] to consider both future extensions and
API simplicity to make a final call on having 2 APIs and 2 callbacks vs 1.

bq. My thought is, since we already ask the user to provide the old container when he sends
out the change request, he should have the old container already, so we don't necessarily
have to provide the old container info in the callback method
I am fine either ways. I discussed the same thing offline with [~leftnoteasy] and he thought
that having the old container info would make life easier for the user. A user who is using
this API is very likely going to have state about the container for which a change has been
requested and can match them up using the containerId.

bq. The AbstractCallbackHandler.onError will be called when the change container request throws
exception on the RM side.
The change request is sent inside the allocate heartbeat request, right? So i am not sure
how we get an exception back for the specific case of a failed container change request. Or
are you saying that invalid container resource change requests are immediately rejected by
the RM synchronously in the allocate RPC?

bq. I am not against providing a separate cancel API. But I think the API needs to be clear
that the cancel is only for increase request, NOT decrease request (just like we don't have
something like cancel release container).
Having a simple cancel request regardless of increase or decrease is preferable since then
we are not leaking the current state of the implementation to the user. It is future safe
E.g. later if we find an issue with decreasing and fixing that makes it non-instantaneous,
then we dont want to have to change the API to support that. But today, given that it is instantaneous,
we can simply ignore the cancellation of a decrease in the cancel method of AMRMClient. I
think the RM does not support a cancel container resource change request. Does it? If it does
not, then perhaps this API can be discussed/added in a separate jira after there is back end
support for it.

> Make AMRMClient support send increase container request and get increased/decreased containers
> ----------------------------------------------------------------------------------------------
>                 Key: YARN-1509
>                 URL: https://issues.apache.org/jira/browse/YARN-1509
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Wangda Tan (No longer used)
>            Assignee: MENG DING
>         Attachments: YARN-1509.1.patch, YARN-1509.2.patch, YARN-1509.3.patch, YARN-1509.4.patch,
> As described in YARN-1197, we need add API in AMRMClient to support
> 1) Add increase request
> 2) Can get successfully increased/decreased containers from RM

This message was sent by Atlassian JIRA

View raw message