Adam Kawa created YARN-2230: ------------------------------- Summary: Fix description of yarn.scheduler.maximum-allocation-vcores in yarn-default.xml (or code to show) Key: YARN-2230 URL: https://issues.apache.org/jira/browse/YARN-2230 Project: Hadoop YARN Issue Type: Bug Reporter: Adam Kawa Priority: Minor When a user requests more vcores than the allocation limit (e.g. mapreduce.map.cpu.vcores is larger than yarn.scheduler.maximum-allocation-vcores), then InvalidResourceRequestException is thrown - https://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerUtils.java {code} if (resReq.getCapability().getVirtualCores() < 0 || resReq.getCapability().getVirtualCores() > maximumResource.getVirtualCores()) { throw new InvalidResourceRequestException("Invalid resource request" + ", requested virtual cores < 0" + ", or requested virtual cores > max configured" + ", requestedVirtualCores=" + resReq.getCapability().getVirtualCores() + ", maxVirtualCores=" + maximumResource.getVirtualCores()); } {code} According to documentation - yarn-default.xml http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-common/yarn-default.xml, the request will be capped to the allocation limit: {code} The maximum allocation for every container request at the RM, in terms of virtual CPU cores. Requests higher than this won't take effect, and will get capped to this value. yarn.scheduler.maximum-allocation-vcores 32 {code} * Either documentation or code should be corrected (unless this exception is handled elsewhere accordingly, but it looks that it is not). This behavior is confusing, because when such a job (with mapreduce.map.cpu.vcores is larger than yarn.scheduler.maximum-allocation-vcores) is submitted, it does not make any progress. The warnings/exceptions are thrown at the scheduler (RM) side e.g. {code} 2014-06-29 00:34:51,469 WARN org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Invalid resource ask by application appattempt_1403993411503_0002_000001 org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, requested virtual cores < 0, or requested virtual cores > max configured, requestedVirtualCores=32, maxVirtualCores=3 at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:237) at org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.validateResourceRequests(RMServerUtils.java:80) at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:420) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60) at org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1986) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1982) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:416) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1980) {code} * IMHO, such an exception should be forwarded to client. Otherwise, it is non obvious to discover why a job does not make any progress. The same looks to be related to memory. -- This message was sent by Atlassian JIRA (v6.2#6252)