From: Maxim Khutornenko
To: dev@aurora.apache.org
Reply-To: dev@aurora.apache.org
Subject: Re: Non-exclusive dedicated constraint
Date: Tue, 19 Jan 2016 20:19:16 -0800
References: <854CEFF1-E22F-4C9A-9AD0-674CAB418311@gmail.com>
Mailing-List: contact dev-help@aurora.apache.org; run by ezmlm
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=001a1134b3d40c004f0529bc4a38

--001a1134b3d40c004f0529bc4a38
Content-Type: text/plain; charset=UTF-8

> Can't this just be any old Constraint (not named "dedicated"). In other
> words, doesn't this code already deal with non-dedicated constraints?:
>
> https://github.com/apache/aurora/blob/master/src/main/java/org/apache/aurora/scheduler/filter/SchedulingFilterImpl.java#L193-L197

Not really. There is a subtle difference here. A regular (non-dedicated)
constraint does not prevent other tasks from landing on a given machine
set, whereas dedicated keeps other tasks away by only allowing those
matching the dedicated attribute. What this proposal targets is an
exclusive machine pool that accepts any job carrying this new constraint
while keeping away all other tasks that don't have it.

Following the example from my original post, imagine a GPU machine pool.
Any job (from any role) requiring a GPU resource would be allowed, while
all other jobs that don't have that constraint would be vetoed.

> Also, regarding dedicated constraints necessitating a slave restart - I've
> pondered moving dedicated machine management to the scheduler for similar
> purposes. There's not really much forcing that behavior to be managed with
> a slave attribute.

Would you mind giving a few more hints on the mechanics behind this? How
would the scheduler know about dedicated hw without the slave attributes
set? Are you proposing storing a hostname->attribute mapping in the
scheduler store?

On Tue, Jan 19, 2016 at 7:53 PM, Bill Farner wrote:

> Joe - if you want to pursue this, I suggest you start another thread to
> keep this thread's discussion intact. I will not be able to lead this
> change, but can certainly shepherd!
>
> On Tuesday, January 19, 2016, Joe Smith wrote:
>
> > As an operator, that'd be a relatively simple change in tooling, and the
> > benefits of not forcing a slave restart would be _huge_.
> >
> > Keeping the dedicated semantics (but adding non-exclusive) would be ideal
> > if possible.
> >
> > > On Jan 19, 2016, at 19:09, Bill Farner wrote:
> > >
> > > Also, regarding dedicated constraints necessitating a slave restart -
> > > I've pondered moving dedicated machine management to the scheduler for
> > > similar purposes. There's not really much forcing that behavior to be
> > > managed with a slave attribute.
> > >
> > > On Tue, Jan 19, 2016 at 7:05 PM, John Sirois wrote:
> > >
> > >> On Tue, Jan 19, 2016 at 7:22 PM, Maxim Khutornenko wrote:
> > >>
> > >>> Has anyone explored an idea of having a non-exclusive (wrt job role)
> > >>> dedicated constraint in Aurora before?
> > >>>
> > >>> We do have a dedicated constraint now but it assumes a 1:1
> > >>> relationship between a job role and a slave attribute [1]. For
> > >>> example: a 'www-data/prod/hello' job with a dedicated constraint of
> > >>> 'dedicated': 'www-data/hello' may only be pinned to a particular set
> > >>> of slaves if all of them have 'www-data/hello' attribute set. No other
> > >>> role tasks will be able to land on those slaves unless their
> > >>> 'role/name' pair is added into the slave attribute set.
> > >>>
> > >>> The above is very limiting as it prevents carving out subsets of a
> > >>> shared pool cluster to be used by multiple roles at the same time.
> > >>> Would it make sense to have a free-form dedicated constraint not bound
> > >>> to a particular role? Multiple jobs could then use this type of
> > >>> constraint dynamically without modifying the slave command line (and
> > >>> requiring slave restart).
> > >>
> > >> Can't this just be any old Constraint (not named "dedicated"). In other
> > >> words, doesn't this code already deal with non-dedicated constraints?:
> > >>
> > >> https://github.com/apache/aurora/blob/master/src/main/java/org/apache/aurora/scheduler/filter/SchedulingFilterImpl.java#L193-L197
> > >>
> > >>> This could be quite useful for experimenting purposes (e.g. different
> > >>> host OS) or to target a different hardware offering (e.g. GPUs). In
> > >>> other words, only those jobs that explicitly opt-in to participate in
> > >>> an experiment or hw offering would be landing on that slave set.
> > >>>
> > >>> Thanks,
> > >>> Maxim
> > >>>
> > >>> [1]-
> > >>> https://github.com/apache/aurora/blob/eec985d948f02f46637d87cd4d212eb2a70ef8d0/src/main/java/org/apache/aurora/scheduler/configuration/ConfigurationManager.java#L272-L276
> > >>
> > >> --
> > >> John Sirois
> > >> 303-512-3301
> > >>

--001a1134b3d40c004f0529bc4a38--
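
The distinction drawn in this thread between a regular constraint, the current role-bound dedicated constraint, and the proposed free-form dedicated constraint can be sketched roughly as follows. This is an illustrative Python model only, not Aurora's SchedulingFilterImpl: `Task`, `Host`, `can_schedule`, and the `allow_free_form` flag are hypothetical names, and the role check is folded into the scheduling decision for brevity (the thread points at ConfigurationManager for it).

```python
from dataclasses import dataclass, field

DEDICATED = "dedicated"


@dataclass
class Task:
    role: str
    # attribute name -> required value (hypothetical, simplified shape)
    constraints: dict = field(default_factory=dict)


@dataclass
class Host:
    # attribute name -> set of values configured on the slave
    attributes: dict = field(default_factory=dict)


def can_schedule(task, host, allow_free_form=False):
    """Return True if the task may land on the host (simplified model)."""
    host_dedicated = host.attributes.get(DEDICATED, set())
    wanted = task.constraints.get(DEDICATED)

    # Exclusivity: a dedicated host vetoes every task that does not ask
    # for one of its dedicated values. Regular attributes never do this.
    if host_dedicated and wanted not in host_dedicated:
        return False

    # Current behaviour as described in the thread: the dedicated value
    # is tied to the job's role. The proposal drops this binding.
    if wanted is not None and not allow_free_form:
        if not wanted.startswith(task.role + "/"):
            return False

    # Every constraint the task declares must be satisfied by the host.
    return all(
        value in host.attributes.get(name, set())
        for name, value in task.constraints.items()
    )


rack_host = Host({"rack": {"a"}})                 # regular attribute only
www_host = Host({DEDICATED: {"www-data/hello"}})  # role-bound dedicated
gpu_pool = Host({DEDICATED: {"gpu"}})             # free-form dedicated

gpu_job = Task("ml", {DEDICATED: "gpu"})
plain_job = Task("www-data")

print(can_schedule(plain_job, rack_host))                    # True
print(can_schedule(plain_job, gpu_pool))                     # False
print(can_schedule(gpu_job, gpu_pool))                       # False
print(can_schedule(gpu_job, gpu_pool, allow_free_form=True)) # True
```

In this model a host with only regular attributes still accepts unconstrained tasks, while a dedicated host vetoes them. With the assumed `allow_free_form` switch, any role can opt in to the GPU pool, which is the non-exclusive behaviour asked about in the original post.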