hadoop-yarn-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2885) Create AMRMProxy request interceptor for distributed scheduling decisions for queueable containers
Date Thu, 03 Dec 2015 01:43:11 GMT

    [ https://issues.apache.org/jira/browse/YARN-2885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15037045#comment-15037045 ]

Wangda Tan commented on YARN-2885:
----------------------------------

Thanks [~asuresh] for working on this JIRA. I took a quick glance at your patch and have some questions/comments:

1)
[~sriramsrao] mentioned at: https://issues.apache.org/jira/browse/YARN-2877?focusedCommentId=14221991&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14221991
bq. Capacity is enforced for guaranteed-start containers. For queueable containers, policies
could be pushed down from central-RM (YARN-2885)

I'm not sure whether, with this implementation, queueable resource requests could also be sent
to the RM.

2) I'm not quite sure why isDistributedSchedulingEnabled is required for the AM's AllocateRequest
and RegisterRequest. In my mind, if the AM doesn't want queueable containers, it should simply
not send queueable resource requests. If you agree with 1), the AM should be agnostic to whether
a container is allocated by the RM or an NM; it only needs to know whether an allocated container
is queueable or guaranteed.

3) Why add separate configurations for distributed scheduling, such as:
bq. YarnConfiguration.DIST_SCHEDULING_ENABLED
IIUC, ApplicationMasterService runs in the ResourceManager, am I correct?

4) Some questions/suggestions regarding RegisterApplicationMasterResponse:
- Add a separate class to encapsulate all queueable-request-related information. It would
be null if distributed scheduling is disabled.
- Such information could change during the application master's lifespan, so do you think
we need to add it to AllocateResponse as well?
- What are getMinAllocatableCapabilty and getMaxAllocatableCapabilty? Are they the same as minimumAllocation/maximumAllocation?
If so, why not use the RM's minimumAllocation/maximumAllocation?
- Why does the AM need to know getContainerIdStart?
- Could containerTokenExpiryInterval vary across NMs? If so, would it be better to attach
expiryInterval to each created container?
- getNodeList is not clear enough; maybe call it getQueueableSupportedNodesList?

5) Could you move the API changes into an independent patch? I think other features, such as
centralized resource over-subscription (YARN-1011), could leverage the same set of APIs.
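
A minimal sketch of the first suggestion in 4), under stated assumptions: QueueableSchedulingInfo, RegisterResponseSketch, and every field and getter below are hypothetical names chosen for illustration. They are not taken from the patch or from the YARN API. The idea is only that all queueable-related data lives in one object, which is absent when distributed scheduling is disabled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical holder for everything an AM needs to know about queueable
// requests. Field names are illustrative, not taken from the patch.
final class QueueableSchedulingInfo {
    private final int minAllocMemoryMb;
    private final int maxAllocMemoryMb;
    private final long containerTokenExpiryIntervalMs;
    private final List<String> queueableSupportedNodes;

    QueueableSchedulingInfo(int minAllocMemoryMb, int maxAllocMemoryMb,
                            long containerTokenExpiryIntervalMs,
                            List<String> queueableSupportedNodes) {
        this.minAllocMemoryMb = minAllocMemoryMb;
        this.maxAllocMemoryMb = maxAllocMemoryMb;
        this.containerTokenExpiryIntervalMs = containerTokenExpiryIntervalMs;
        // Defensive copy so later changes by the caller do not leak in.
        this.queueableSupportedNodes =
            Collections.unmodifiableList(new ArrayList<>(queueableSupportedNodes));
    }

    int getMinAllocMemoryMb() { return minAllocMemoryMb; }
    int getMaxAllocMemoryMb() { return maxAllocMemoryMb; }
    long getContainerTokenExpiryIntervalMs() { return containerTokenExpiryIntervalMs; }
    List<String> getQueueableSupportedNodes() { return queueableSupportedNodes; }
}

// Hypothetical stand-in for a register response. The queueable info is
// stored as null when distributed scheduling is disabled and exposed as
// an Optional so callers cannot forget the disabled case.
final class RegisterResponseSketch {
    private final QueueableSchedulingInfo queueableInfo;

    RegisterResponseSketch(QueueableSchedulingInfo queueableInfo) {
        this.queueableInfo = queueableInfo;
    }

    Optional<QueueableSchedulingInfo> getQueueableInfo() {
        return Optional.ofNullable(queueableInfo);
    }
}

final class QueueableInfoDemo {
    public static void main(String[] args) {
        // Distributed scheduling disabled: no queueable info at all.
        RegisterResponseSketch disabled = new RegisterResponseSketch(null);
        System.out.println("disabled has info: "
            + disabled.getQueueableInfo().isPresent());

        // Distributed scheduling enabled: one object carries all settings.
        RegisterResponseSketch enabled = new RegisterResponseSketch(
            new QueueableSchedulingInfo(512, 8192, 600_000L,
                Arrays.asList("nm1.example.com:8041", "nm2.example.com:8041")));
        enabled.getQueueableInfo().ifPresent(info ->
            System.out.println("queueable nodes: " + info.getQueueableSupportedNodes()));
    }
}
```

With this shape, an AM-side caller never needs a separate enabled/disabled flag: it checks whether the info object is present. The same object could also be attached to a later response if the values change during the AM's lifespan, as the second bullet in 4) asks.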

> Create AMRMProxy request interceptor for distributed scheduling decisions for queueable containers
> --------------------------------------------------------------------------------------------------
>
>                 Key: YARN-2885
>                 URL: https://issues.apache.org/jira/browse/YARN-2885
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager, resourcemanager
>            Reporter: Konstantinos Karanasos
>            Assignee: Arun Suresh
>         Attachments: YARN-2885-yarn-2877.001.patch
>
>
> We propose to add a Local ResourceManager (LocalRM) to the NM in order to support distributed scheduling decisions.
> Architecturally we leverage the RMProxy, introduced in YARN-2884.
> The LocalRM makes distributed decisions for queueable container requests.
> Guaranteed-start requests are still handled by the central RM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
