Date: Mon, 5 Jan 2015 22:58:35 +0000 (UTC)
From: "Vijay Bhat (JIRA)"
To: yarn-issues@hadoop.apache.org
Reply-To: yarn-issues@hadoop.apache.org
Mailing-List: contact yarn-issues-help@hadoop.apache.org; run by ezmlm
Subject: [jira] [Commented] (YARN-2230) Fix description of yarn.scheduler.maximum-allocation-vcores in yarn-default.xml (or code)

    [ https://issues.apache.org/jira/browse/YARN-2230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265276#comment-14265276 ]

Vijay Bhat commented on YARN-2230:
----------------------------------

Jian, I (Vijay Bhat) can definitely take care of that. Also, since this is my first submission, I wanted to clarify - is the protocol that I assign the JIRA to myself once I submit the patch? Apologies for any confusion. Thanks!

> Fix description of yarn.scheduler.maximum-allocation-vcores in yarn-default.xml (or code)
> -----------------------------------------------------------------------------------------
>
>                 Key: YARN-2230
>                 URL: https://issues.apache.org/jira/browse/YARN-2230
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: client, documentation, scheduler
>    Affects Versions: 2.4.0
>            Reporter: Adam Kawa
>            Priority: Minor
>         Attachments: YARN-2230.001.patch
>
>
> When a user requests more vcores than the allocation limit (e.g. mapreduce.map.cpu.vcores is larger than yarn.scheduler.maximum-allocation-vcores), an InvalidResourceRequestException is thrown - https://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerUtils.java
> {code}
>     if (resReq.getCapability().getVirtualCores() < 0 ||
>         resReq.getCapability().getVirtualCores() >
>         maximumResource.getVirtualCores()) {
>       throw new InvalidResourceRequestException("Invalid resource request"
>           + ", requested virtual cores < 0"
>           + ", or requested virtual cores > max configured"
>           + ", requestedVirtualCores="
>           + resReq.getCapability().getVirtualCores()
>           + ", maxVirtualCores=" + maximumResource.getVirtualCores());
>     }
> {code}
> According to the documentation - yarn-default.xml http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-common/yarn-default.xml, the request should be capped to the allocation limit.
> {code}
>   <property>
>     <description>The maximum allocation for every container request at the RM,
>     in terms of virtual CPU cores. Requests higher than this won't take effect,
>     and will get capped to this value.</description>
>     <name>yarn.scheduler.maximum-allocation-vcores</name>
>     <value>32</value>
>   </property>
> {code}
> This means that:
> * Either the documentation or the code should be corrected (unless this exception is handled accordingly elsewhere, but it appears that it is not).
> This behavior is confusing, because when such a job (with mapreduce.map.cpu.vcores larger than yarn.scheduler.maximum-allocation-vcores) is submitted, it does not make any progress. The warnings/exceptions are thrown on the scheduler (RM) side, e.g.
> {code}
> 2014-06-29 00:34:51,469 WARN org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Invalid resource ask by application appattempt_1403993411503_0002_000001
> org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, requested virtual cores < 0, or requested virtual cores > max configured, requestedVirtualCores=32, maxVirtualCores=3
> 	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:237)
> 	at org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.validateResourceRequests(RMServerUtils.java:80)
> 	at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:420)
> .....
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1986)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1982)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:416)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1980)
> {code}
> * IMHO, such an exception should be forwarded to the client. Otherwise, it is not obvious why a job does not make any progress.
> The same appears to apply to memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
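
The mismatch described in the quoted report can be sketched in isolation. What follows is a minimal, hypothetical illustration and not the YARN code: `VcoreLimitSketch`, `capToMaximum`, and `rejectIfInvalid` are names invented here, plain `int` values stand in for YARN's resource objects, and `IllegalArgumentException` stands in for `InvalidResourceRequestException`. The values 32 and 3 are taken from the log in the report.

```java
// Hypothetical stand-alone sketch; it does not use or reproduce the real YARN classes.
final class VcoreLimitSketch {

    /** Behavior the yarn-default.xml description promises: oversized requests are capped. */
    static int capToMaximum(int requestedVcores, int maxVcores) {
        return Math.min(requestedVcores, maxVcores);
    }

    /** Behavior of the quoted check: negative or oversized requests are rejected. */
    static int rejectIfInvalid(int requestedVcores, int maxVcores) {
        if (requestedVcores < 0 || requestedVcores > maxVcores) {
            // Stand-in for InvalidResourceRequestException (an assumption of this sketch).
            throw new IllegalArgumentException("Invalid resource request"
                + ", requestedVirtualCores=" + requestedVcores
                + ", maxVirtualCores=" + maxVcores);
        }
        return requestedVcores;
    }

    public static void main(String[] args) {
        int requested = 32; // requestedVirtualCores from the log in the report
        int max = 3;        // maxVirtualCores from the log in the report

        System.out.println("capped: " + capToMaximum(requested, max));
        try {
            rejectIfInvalid(requested, max);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With the documented semantics the request would proceed with 3 vcores; with the check as quoted it is refused, which matches the stalled job the reporter describes.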