hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Szilard Nemeth (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-8202) DefaultAMSProcessor should properly check units of requested custom resource types against minimum/maximum allocation
Date Tue, 08 May 2018 21:06:00 GMT

     [ https://issues.apache.org/jira/browse/YARN-8202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Szilard Nemeth updated YARN-8202:
---------------------------------
    Attachment: YARN-8202-009.patch

> DefaultAMSProcessor should properly check units of requested custom resource types against
minimum/maximum allocation
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-8202
>                 URL: https://issues.apache.org/jira/browse/YARN-8202
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Szilard Nemeth
>            Assignee: Szilard Nemeth
>            Priority: Blocker
>         Attachments: YARN-8202-001.patch, YARN-8202-002.patch, YARN-8202-003.patch, YARN-8202-004.patch,
YARN-8202-005.patch, YARN-8202-006.patch, YARN-8202-007.patch, YARN-8202-008.patch, YARN-8202-009.patch
>
>
>  
> When I execute a pi job with arguments: 
> {code:java}
> -Dmapreduce.map.resource.memory-mb=200 -Dmapreduce.map.resource.resource1=500M 1 1000{code}
> and I have one node with 5GB of resource1, I get the following exception on every second
and the job hangs:
> {code:java}
> 2018-04-24 08:42:03,694 INFO org.apache.hadoop.ipc.Server: IPC Server handler 20 on 8030,
call Call#386 Retry#0 org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB.allocate from
172.31.119.172:58138
> org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request,
requested resource type=[resource1] < 0 or greater than maximum allowed allocation. Requested
resource=<memory:200, vCores:1, resource1: 500M>, maximum allowed allocation=<memory:6144,
vCores:8, resource1: 5G>, please note that maximum allowed allocation is calculated by
scheduler based on maximum resource of registered NodeManagers, which might be less than configured
maximum allocation=<memory:8192, vCores:8192, resource1: 9223372036854775807G>
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:286)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:242)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndvalidateRequest(SchedulerUtils.java:258)
>         at org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.normalizeAndValidateRequests(RMServerUtils.java:249)
>         at org.apache.hadoop.yarn.server.resourcemanager.DefaultAMSProcessor.allocate(DefaultAMSProcessor.java:230)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.processor.DisabledPlacementProcessor.allocate(DisabledPlacementProcessor.java:75)
>         at org.apache.hadoop.yarn.server.resourcemanager.AMSProcessingChain.allocate(AMSProcessingChain.java:92)
>         at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:433)
>         at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
>         at org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
>         at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
>         at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
> {code}
> *This is because org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils#validateResourceRequest
does not take resource units into account.*
>  
> However, if I start a job with arguments: 
> {code:java}
> -Dmapreduce.map.resource.memory-mb=200 -Dmapreduce.map.resource.resource1=1G 1 1000{code}
> and I still have 5GB of resource1 on one node then the job runs successfully.
>  
> I also tried a third job run, when I request 1GB of resource1 and I have no nodes with
any amount of resource1, then I restart the node with 5GBs of resource1, the job ultimately
completes, but just after the node with enough resources registered in RM, which is the desired
behaviour.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org


Mime
View raw message