aurora-dev mailing list archives

From Maxim Khutornenko <ma...@apache.org>
Subject Re: Non-exclusive dedicated constraint
Date Wed, 20 Jan 2016 04:31:43 GMT
Right, that's what I thought. Yes, it sounds interesting. My only
concern is the GC burden of getting rid of hostnames that are obsolete
and no longer exist. Relying on offers to update hostname 'relevance'
may not work as dedicated hosts may be fully packed and not release
any resources for a very long time. Let me explore this idea a bit to
see what it would take to implement.
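
A minimal sketch of that concern, assuming a hypothetical scheduler-side map
from hostname to the time of its last offer. None of these names come from
Aurora; they only illustrate why offer recency alone may be a poor signal for
garbage-collecting reserved hostnames.

```python
from typing import Dict, Set


def stale_hosts(last_offer_at: Dict[str, float], now: float, ttl: float) -> Set[str]:
    """Return hosts whose most recent offer is older than ttl seconds.

    Hypothetical helper: an offer-driven GC would drop these hosts'
    reservations from the scheduler store.
    """
    return {host for host, seen in last_offer_at.items() if now - seen > ttl}


# Hypothetical data: timestamps (seconds) of the last offer seen per host.
last_offer_at = {
    "gpu-1": 0.0,     # alive but fully packed, so it has sent no offers since
    "gone-1": 0.0,    # decommissioned, will never send an offer again
    "gpu-2": 950.0,   # alive with spare resources, offers regularly
}

expired = stale_hosts(last_offer_at, now=1000.0, ttl=600.0)
```

Here the fully packed host and the decommissioned host look identical to the
collector, so a GC based only on offers would need some other liveness signal.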

On Tue, Jan 19, 2016 at 8:22 PM, Bill Farner <wfarner@apache.org> wrote:
> Not a host->attribute mapping (attribute in the mesos sense, anyway).  Rather
> an out-of-band API for marking machines as reserved.  For task->offer
> mapping it's just a matter of another data source.  Does that make sense?
>
> On Tuesday, January 19, 2016, Maxim Khutornenko <maxim@apache.org> wrote:
>
>> > Can't this just be any old Constraint (not named "dedicated").  In other
>> > words, doesn't this code already deal with non-dedicated constraints?:
>> >
>> > https://github.com/apache/aurora/blob/master/src/main/java/org/apache/aurora/scheduler/filter/SchedulingFilterImpl.java#L193-L197
>>
>>
>> Not really. There is a subtle difference here. A regular (non-dedicated)
>> constraint does not prevent other tasks from landing on a given machine set
>> whereas dedicated keeps other tasks away by only allowing those matching
>> the dedicated attribute. What this proposal targets is allowing exclusive
>> machine pool matching any job that has this new constraint while keeping
>> all other tasks that don't have that attribute away.
>>
>> Following an example from my original post, imagine a GPU machine pool. Any
>> job (from any role) requiring GPU resource would be allowed while all other
>> jobs that don't have that constraint would be vetoed.
>>
>> > Also, regarding dedicated constraints necessitating a slave restart - i've
>> > pondered moving dedicated machine management to the scheduler for similar
>> > purposes.  There's not really much forcing that behavior to be managed with
>> > a slave attribute.
>>
>>
>> Would you mind giving a few more hints on the mechanics behind this? How
>> would scheduler know about dedicated hw without the slave attributes set?
>> Are you proposing storing hostname->attribute mapping in the scheduler
>> store?
>>
>> On Tue, Jan 19, 2016 at 7:53 PM, Bill Farner <wfarner@apache.org> wrote:
>>
>> > Joe - if you want to pursue this, I suggest you start another thread to
>> > keep this thread's discussion intact.  I will not be able to lead this
>> > change, but can certainly shepherd!
>> >
>> > On Tuesday, January 19, 2016, Joe Smith <yasumoto7@gmail.com> wrote:
>> >
>> > > As an operator, that'd be a relatively simple change in tooling, and the
>> > > benefits of not forcing a slave restart would be _huge_.
>> > >
>> > > Keeping the dedicated semantics (but adding non-exclusive) would be ideal
>> > > if possible.
>> > >
>> > > > On Jan 19, 2016, at 19:09, Bill Farner <wfarner@apache.org> wrote:
>> > > >
>> > > > Also, regarding dedicated constraints necessitating a slave restart - i've
>> > > > pondered moving dedicated machine management to the scheduler for similar
>> > > > purposes.  There's not really much forcing that behavior to be managed with
>> > > > a slave attribute.
>> > > >
>> > > > On Tue, Jan 19, 2016 at 7:05 PM, John Sirois <john@conductant.com> wrote:
>> > > >
>> > > >> On Tue, Jan 19, 2016 at 7:22 PM, Maxim Khutornenko <maxim@apache.org> wrote:
>> > > >>
>> > > >>> Has anyone explored an idea of having a non-exclusive (wrt job role)
>> > > >>> dedicated constraint in Aurora before?
>> > > >>
>> > > >>
>> > > >>> We do have a dedicated constraint now but it assumes a 1:1
>> > > >>> relationship between a job role and a slave attribute [1]. For
>> > > >>> example: a 'www-data/prod/hello' job with a dedicated constraint of
>> > > >>> 'dedicated': 'www-data/hello' may only be pinned to a particular set
>> > > >>> of slaves if all of them have 'www-data/hello' attribute set. No other
>> > > >>> role tasks will be able to land on those slaves unless their
>> > > >>> 'role/name' pair is added into the slave attribute set.
>> > > >>>
>> > > >>> The above is very limiting as it prevents carving out subsets of a
>> > > >>> shared pool cluster to be used by multiple roles at the same time.
>> > > >>> Would it make sense to have a free-form dedicated constraint not bound
>> > > >>> to a particular role? Multiple jobs could then use this type of
>> > > >>> constraint dynamically without modifying the slave command line (and
>> > > >>> requiring slave restart).
>> > > >>
>> > > >> Can't this just be any old Constraint (not named "dedicated").  In other
>> > > >> words, doesn't this code already deal with non-dedicated constraints?:
>> > > >>
>> > > >> https://github.com/apache/aurora/blob/master/src/main/java/org/apache/aurora/scheduler/filter/SchedulingFilterImpl.java#L193-L197
>> > > >>
>> > > >>
>> > > >>> This could be quite useful for experimenting purposes (e.g. different
>> > > >>> host OS) or to target a different hardware offering (e.g. GPUs). In
>> > > >>> other words, only those jobs that explicitly opt-in to participate in
>> > > >>> an experiment or hw offering would be landing on that slave set.
>> > > >>>
>> > > >>> Thanks,
>> > > >>> Maxim
>> > > >>>
>> > > >>> [1] - https://github.com/apache/aurora/blob/eec985d948f02f46637d87cd4d212eb2a70ef8d0/src/main/java/org/apache/aurora/scheduler/configuration/ConfigurationManager.java#L272-L276
>> > > >>
>> > > >>
>> > > >>
>> > > >> --
>> > > >> John Sirois
>> > > >> 303-512-3301
>> > > >>
>> > >
>> >
>>
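
The distinction drawn in this thread (a regular constraint, the existing
role-bound dedicated constraint, and the proposed non-exclusive dedicated
constraint) can be sketched roughly as follows. This is a simplified Python
model, not the logic in SchedulingFilterImpl or ConfigurationManager; the
Task and Host types, the veto() and dedicated_value_allowed() functions, and
the role_bound flag are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Task:
    role: str
    name: str
    # Constraint name -> required value, e.g. {"dedicated": "www-data/hello"}.
    constraints: Dict[str, str] = field(default_factory=dict)


@dataclass
class Host:
    name: str
    # Attribute name -> set of values carried by the host.
    attributes: Dict[str, Set[str]] = field(default_factory=dict)


def veto(task: Task, host: Host) -> Optional[str]:
    """Return a veto reason, or None if the task may land on the host."""
    # Regular constraint matching: the host must carry every value the task asks for.
    for key, value in task.constraints.items():
        if value not in host.attributes.get(key, set()):
            return f"host lacks {key}={value}"
    # Dedicated semantics: a dedicated host also keeps away every task that
    # did not ask for one of its dedicated values.
    dedicated = host.attributes.get("dedicated")
    if dedicated and task.constraints.get("dedicated") not in dedicated:
        return "host is dedicated"
    return None


def dedicated_value_allowed(task: Task, role_bound: bool = True) -> bool:
    """Model of the role check on the dedicated value.

    role_bound=True models the current 1:1 role binding described in the
    thread; role_bound=False models the proposed free-form value (assumption).
    """
    value = task.constraints.get("dedicated")
    if value is None or not role_bound:
        return True
    return value.split("/", 1)[0] == task.role
```

In this model, a host with 'dedicated': 'gpu' accepts a task from any role
that carries the matching constraint and vetoes tasks without it, which is
the GPU pool example from the thread. The role_bound flag marks the only
difference from today's behavior: whether the value must start with the
job's own role.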
