Date: Wed, 17 Oct 2012 15:42:06 +0000 (UTC)
From: "Robert Joseph Evans (JIRA)"
To: yarn-issues@hadoop.apache.org
Message-ID: <1911445875.58207.1350488526213.JavaMail.jiratomcat@arcas>
Subject: [jira] [Commented] (YARN-2) Enhance CS to schedule accounting for both memory and cpu cores

    [ https://issues.apache.org/jira/browse/YARN-2?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477965#comment-13477965 ]

Robert Joseph Evans commented on YARN-2:
----------------------------------------

Arun,

I still disagree with the #cores being an int. What does requesting 1 CPU really mean, and how is it different from requesting 1.8?

To me, 1 CPU means that for this particular container I want to be guaranteed at least 1 full CPU core to itself for computation at any point in time it needs it, very similar to what requesting 3000MB of memory does. It is a bit more ambiguous because 1 CPU on box A is not necessarily equivalent to 1 CPU on box B, but this JIRA already makes the assumption that they are close enough to being equivalent. It gives me, as a user of the container, a chance to set a lower bound on the amount of resources that I am guaranteed to be able to use. In practice this probably means that the kernel will give at least X% of the available CPU time to the processes running in that container, if those processes are runnable, where X = 100 * CPUs requested / total CPU cores on the box.

1.8 CPUs to me means a few things. First, the person requesting this was either a machine or was overly ambitious in trying to get an exact value. Second, the container will probably get 2 CPU cores, because just like with memory I would expect the scheduler to round the request up to the nearest multiple of a scheduling unit. I proposed initially that quarter or even half CPU marks are probably sufficient.

We can always round up and remove precision with a float. It is very hard to go back the other way and add precision to an int. I am fine if, for the first go-around, the CPU number is a float and the scheduling unit is 1 CPU. I just want the door left open so we can easily adjust things if we find a need to.

Over-subscribing makes sense, but it also has a lot of pitfalls. You have to take into account that resource utilization is not constant. A process can use very little of a resource and then all of a sudden start to use lots of it. Is the resource request a guarantee of those resources, or is it just a best effort to provide them? I see situations where users would want both, and perhaps if we do support over-subscribing we need to support something like nice on POSIX.

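To make the rounding and the X% lower bound above concrete, here is a minimal sketch in Python. It is not code from the YARN-2 patch: the function names, the default scheduling unit, and the 8-core box are all made up for illustration, and it assumes the scheduler rounds a fractional request up to the next multiple of the scheduling unit.

```python
import math
from fractions import Fraction


def round_up_to_unit(requested_cores, unit=1.0):
    """Round a core request up to the nearest multiple of `unit`.

    Hypothetical helper, not a YARN API. Fractions are used so that
    values such as 1.1 / 0.25 do not pick up float rounding noise.
    """
    if requested_cores <= 0 or unit <= 0:
        raise ValueError("requested_cores and unit must be positive")
    req = Fraction(str(requested_cores))
    step = Fraction(str(unit))
    return float(math.ceil(req / step) * step)


def guaranteed_cpu_share(granted_cores, total_cores):
    """Lower bound, in percent, on CPU time for a runnable container."""
    if total_cores <= 0:
        raise ValueError("total_cores must be positive")
    return 100.0 * granted_cores / total_cores


if __name__ == "__main__":
    # A 1.8-core request with a scheduling unit of 1 CPU.
    granted = round_up_to_unit(1.8, unit=1.0)
    print(granted)
    # The same kind of request with quarter-CPU marks keeps more precision.
    print(round_up_to_unit(1.1, unit=0.25))
    # Share of an assumed 8-core box that the granted cores represent.
    print(guaranteed_cpu_share(granted, total_cores=8))
```

With a float in the request, moving from a 1 CPU unit to quarter-CPU marks is only a change to `unit`. With an int in the request, the fractional part is gone before the scheduler ever sees it.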
> Enhance CS to schedule accounting for both memory and cpu cores
> ---------------------------------------------------------------
>
>                 Key: YARN-2
>                 URL: https://issues.apache.org/jira/browse/YARN-2
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: capacityscheduler, scheduler
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>             Fix For: 2.0.3-alpha
>
>         Attachments: MAPREDUCE-4327.patch, MAPREDUCE-4327.patch, MAPREDUCE-4327.patch, MAPREDUCE-4327-v2.patch, MAPREDUCE-4327-v3.patch, MAPREDUCE-4327-v4.patch, MAPREDUCE-4327-v5.patch, YARN-2-help.patch, YARN-2.patch, YARN-2.patch, YARN-2.patch
>
>
> With YARN being a general purpose system, it would be useful for several applications (MPI et al) to specify not just memory but also CPU (cores) for their resource requirements. Thus, it would be useful to the CapacityScheduler to account for both.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira