From: Namikaze Minato
Date: Tue, 29 Sep 2015 11:27:30 +0200
Subject: Re: Concurrency control
To: user@hadoop.apache.org

I think Laxman should also tell us more about which application type he is
running. The normal use case of MAPREDUCE should work as intended, but if he
has, for example, one map using 100 vcores, then the second map will have to
wait until the app completes. The same would happen if the applications
running were Spark, as Spark does not free what is allocated to it.

Regards,
LLoyd

On 29 September 2015 at 11:22, Naganarasimha G R (Naga) wrote:
> Thanks Rohith for your thoughts.
> But I think this configuration might not completely solve the scenario
> mentioned by Laxman: if there is some time gap between the first and the
> second app, then even though we have fairness or priority set for apps,
> starvation will still be there.
> IIUC, we can think of an approach where we have something similar to
> "yarn.scheduler.capacity.<queue-path>.user-limit-factor", providing
> functionality like
> "yarn.scheduler.capacity.<queue-path>.app-limit-factor": the multiple of
> the queue capacity which can be configured to allow a single app to acquire
> more resources. Thoughts?
>
> + Naga
>
> ________________________________
> From: Rohith Sharma K S [rohithsharmaks@huawei.com]
> Sent: Tuesday, September 29, 2015 14:07
> To: user@hadoop.apache.org
> Subject: RE: Concurrency control
>
> Hi Laxman,
>
> In Hadoop 2.8 (not released yet), CapacityScheduler provides a configuration
> for the ordering policy. By configuring FAIR_ORDERING_POLICY in CS, you
> should be able to achieve your goal, i.e. avoid starving applications of
> resources.
>
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.policy.FairOrderingPolicy
>
> An OrderingPolicy which orders SchedulableEntities for fairness (see
> FairScheduler FairSharePolicy); generally, entities with lesser usage are
> scheduled first. If sizeBasedWeight is set to true, then an application with
> high demand may be prioritized ahead of an application with less usage. This
> offsets the tendency to favor small apps, which could result in starvation
> for large apps if many small ones enter and leave the queue continuously
> (optional, default false).
>
> Community Issue Id: https://issues.apache.org/jira/browse/YARN-3463
>
> Thanks & Regards
> Rohith Sharma K S
>
> From: Laxman Ch [mailto:laxman.lux@gmail.com]
> Sent: 29 September 2015 13:36
> To: user@hadoop.apache.org
> Subject: Re: Concurrency control
>
> Bouncing this thread again. Any other thoughts please?
>
> On 17 September 2015 at 23:21, Laxman Ch wrote:
>
> No Naga. That won't help.
>
> I am running two applications (app1 - 100 vcores, app2 - 100 vcores) with
> the same user, which run in the same queue (capacity = 100 vcores).
In this scenario,
> if app1 starts first, it occupies all the slots; if it runs long, app2 will
> starve for a long time.
>
> Let me reiterate my problem statement. I want "to control the amount of
> resources (vcores, memory) used by an application SIMULTANEOUSLY".
>
> On 17 September 2015 at 22:28, Naganarasimha Garla wrote:
>
> Hi Laxman,
> For the example you have stated, maybe we can do the following things:
> 1. Create/modify the queue with capacity and max capacity set such that it
> is equivalent to 100 vcores. As there is no elasticity, a given application
> will not use resources beyond the configured capacity.
> 2. Set yarn.scheduler.capacity.<queue-path>.minimum-user-limit-percent so
> that each active user is assured the minimum guaranteed resources. The
> default value of 100 implies no user limits are imposed.
>
> Additionally, we can think of
> "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage",
> which will enforce strict cpu usage for a given container if required.
>
> + Naga
>
> On Thu, Sep 17, 2015 at 4:42 PM, Laxman Ch wrote:
>
> Yes, I'm already using cgroups. Cgroups help in controlling the resources
> at the container level, but my requirement is more about controlling the
> concurrent resource usage of an application at the whole-cluster level.
>
> And yes, we do configure queues properly. But that won't help.
>
> For example, I have an application with a requirement of 1000 vcores, but I
> want to keep this application from going beyond 100 vcores at any point of
> time in the cluster/queue. This makes the application run longer even when
> my cluster is free, but I will be able to meet the guaranteed SLAs of other
> applications.
>
> Hope this helps to understand my question.
>
> And thanks Narasimha for the quick response.
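For reference, the settings Naga mentions would look roughly like the
following in capacity-scheduler.xml and yarn-site.xml. This is only a sketch:
the property names are from the Hadoop 2.6 docs, but the queue path
`root.default` and the numeric values are illustrative, not taken from the
thread.

```xml
<!-- capacity-scheduler.xml: pin capacity and maximum-capacity to the same
     value so the queue cannot elastically grow beyond it (illustrative
     queue path and values) -->
<property>
  <name>yarn.scheduler.capacity.root.default.capacity</name>
  <value>50</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
  <value>50</value>
</property>

<!-- guarantee each active user in the queue a minimum share; the default
     of 100 imposes no user limits -->
<property>
  <name>yarn.scheduler.capacity.root.default.minimum-user-limit-percent</name>
  <value>50</value>
</property>
```

```xml
<!-- yarn-site.xml: with the LinuxContainerExecutor and cgroups enabled,
     enforce a hard CPU limit per container instead of allowing containers
     to use idle CPU -->
<property>
  <name>yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage</name>
  <value>true</value>
</property>
```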
> On 17 September 2015 at 16:17, Naganarasimha Garla wrote:
>
> Hi Laxman,
> Yes, if cgroups are enabled and "yarn.scheduler.capacity.resource-calculator"
> is configured to DominantResourceCalculator, then cpu and memory can be
> controlled.
> Please refer further to the official documentation:
> http://hadoop.apache.org/docs/r1.2.1/capacity_scheduler.html
>
> But if you say more about the problem, then we can suggest an ideal
> configuration. It seems like the capacity configuration and the splitting of
> the queue are not done right, or you might look at the Fair Scheduler if you
> want more fairness in container allocation across different apps.
>
> On Thu, Sep 17, 2015 at 4:10 PM, Laxman Ch wrote:
>
> Hi,
>
> In YARN, do we have any way to control the amount of resources (vcores,
> memory) used by an application SIMULTANEOUSLY?
>
> - In my cluster, I noticed a large, long-running MR app occupy all the
> slots of the queue, blocking other apps from getting started.
> - I'm using the Capacity Scheduler (hierarchical queues, preemption
> disabled).
> - Using Hadoop version 2.6.0.
> - Did some googling around this and went through the configuration docs, but
> I'm not able to find anything that matches my requirement.
>
> If needed, I can provide more details on the use case and problem.
>
> --
> Thanks,
> Laxman
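For completeness, the Hadoop 2.8 ordering-policy settings Rohith describes
would be configured roughly as follows. This is a sketch based on YARN-3463;
`root.default` is an illustrative queue path, not one from the thread.

```xml
<!-- capacity-scheduler.xml: order apps within the queue fairly by usage
     instead of the default FIFO ordering -->
<property>
  <name>yarn.scheduler.capacity.root.default.ordering-policy</name>
  <value>fair</value>
</property>

<!-- optional: also weight apps by their demand, so large apps are not
     starved by a continuous stream of small ones (default false) -->
<property>
  <name>yarn.scheduler.capacity.root.default.ordering-policy.fair.enable-size-based-weight</name>
  <value>true</value>
</property>
```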