hadoop-yarn-issues mailing list archives

From "Botong Huang (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-7631) ResourceRequest with different Capacity (Resource) overrides each other in RM
Date Sat, 09 Dec 2017 00:40:00 GMT

     [ https://issues.apache.org/jira/browse/YARN-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Botong Huang updated YARN-7631:
-------------------------------
    Attachment: resourcebug.patch

> ResourceRequest with different Capacity (Resource) overrides each other in RM
> -----------------------------------------------------------------------------
>
>                 Key: YARN-7631
>                 URL: https://issues.apache.org/jira/browse/YARN-7631
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Botong Huang
>         Attachments: resourcebug.patch
>
>
> Today in AMRMClientImpl, the ResourceRequests (RRs) are kept as: RequestId -> Priority
> -> ResourceName -> ExecutionType -> Resource (Capacity) -> ResourceRequestInfo
> (the actual RR).
> This means that only RRs with the same (requestId, priority, resourceName, executionType,
> resource) are grouped and aggregated together.
> On the RM side, however, the mapping is SchedulerRequestKey (requestId, priority) ->
> LocalityAppPlacementAllocator (ResourceName -> RR).
> The issue is that on the RM side, Resource is not part of the key to the RR at all. (Note that
> executionType is also not in the RM-side key, but that is fine because the RM handles it
> separately as container update requests.) This means that under the same value of (requestId,
> priority, resourceName), RRs with different Resource values are grouped together and override
> each other in the RM. As a result, some of the container requests are lost and will never be
> allocated. Furthermore, since the two RRs are kept under different keys on the AMRMClient side,
> allocation of RR1 only triggers a cancel for RR1, so the pending RR2 is not resent either.
> I've attached a unit test (resourcebug.patch), which currently fails on trunk, to illustrate
> this issue.
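
The key mismatch described above can be reproduced with plain Java maps. The
sketch below is not taken from resourcebug.patch and does not use any Hadoop
classes: ResourceKeyDemo, Request, storeClientSide and storeRmSide are
hypothetical names, memoryMb stands in for Resource (Capacity), and
executionType is left out for brevity. It only models which fields take part
in the lookup key on each side.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ResourceKeyDemo {

    /** Hypothetical stand-in for a ResourceRequest, reduced to the fields discussed above. */
    static final class Request {
        final long requestId;
        final int priority;
        final String resourceName;
        final int memoryMb;      // stands in for Resource (Capacity)
        final int numContainers;

        Request(long requestId, int priority, String resourceName,
                int memoryMb, int numContainers) {
            this.requestId = requestId;
            this.priority = priority;
            this.resourceName = resourceName;
            this.memoryMb = memoryMb;
            this.numContainers = numContainers;
        }
    }

    /** Client-style table: the resource size is part of the key. */
    static Map<List<Object>, Request> storeClientSide(Request... requests) {
        Map<List<Object>, Request> table = new HashMap<>();
        for (Request r : requests) {
            table.put(Arrays.<Object>asList(
                    r.requestId, r.priority, r.resourceName, r.memoryMb), r);
        }
        return table;
    }

    /** RM-style table: the resource size is NOT part of the key. */
    static Map<List<Object>, Request> storeRmSide(Request... requests) {
        Map<List<Object>, Request> table = new HashMap<>();
        for (Request r : requests) {
            // A later request with the same (requestId, priority, resourceName)
            // silently replaces the earlier one, whatever its size.
            table.put(Arrays.<Object>asList(
                    r.requestId, r.priority, r.resourceName), r);
        }
        return table;
    }

    public static void main(String[] args) {
        Request rr1 = new Request(0L, 1, "*", 1024, 2);
        Request rr2 = new Request(0L, 1, "*", 2048, 3);

        Map<List<Object>, Request> client = storeClientSide(rr1, rr2);
        Map<List<Object>, Request> rm = storeRmSide(rr1, rr2);

        System.out.println("client-side entries: " + client.size()); // 2
        System.out.println("RM-side entries: " + rm.size());         // 1
        Request survivor = rm.values().iterator().next();
        System.out.println("RM keeps only: " + survivor.memoryMb + " MB x "
                + survivor.numContainers);                           // 2048 MB x 3
    }
}
```

In this model the client-side table holds both RR1 and RR2, while the RM-style
table holds only the request that was stored last, which mirrors the lost
container requests described in the issue.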



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

