aurora-dev mailing list archives

From Maxim Khutornenko <ma...@apache.org>
Subject Re: Non-exclusive dedicated constraint
Date Wed, 20 Jan 2016 04:48:58 GMT
Oh, I didn't mean memory GC pressure in the literal sense, but rather the
logical garbage of orphaned hosts that never leave the scheduler. It's
not a concern from a performance standpoint. It is, however, something
operators need to be aware of when a host from a dedicated pool gets
dropped or replaced.
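
To make the "logical garbage" point concrete, here is a minimal sketch of
the Map<String, String> idea from the quoted message below, with a helper
that flags reserved hosts the cluster no longer knows about. Every name in
it (HostReservations, reserve, release, orphanedHosts) is hypothetical and
not part of Aurora, and it assumes the caller can supply a set of currently
known hostnames from some source other than offers.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch only: none of these names exist in Aurora.
final class HostReservations {
  // hostname -> reservation label (e.g. "gpu"); the Map<String, String> idea.
  private final Map<String, String> reservedHosts = new ConcurrentHashMap<>();

  // Marks a host as reserved, replacing any earlier label.
  void reserve(String hostname, String label) {
    reservedHosts.put(hostname, label);
  }

  // Drops a reservation; the operator must call this when a host is retired.
  void release(String hostname) {
    reservedHosts.remove(hostname);
  }

  // Returns reserved hostnames missing from knownHosts, in sorted order.
  // Assumption: knownHosts comes from a source other than offers, since a
  // fully packed dedicated host may not send offers for a long time.
  Set<String> orphanedHosts(Set<String> knownHosts) {
    Set<String> orphans = new TreeSet<>(reservedHosts.keySet());
    orphans.removeAll(knownHosts);
    return orphans;
  }
}
```

The size of the orphanedHosts result is the kind of number that could be
exported as a stat to flag a forgotten cleanup.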

On Tue, Jan 19, 2016 at 8:39 PM, Bill Farner <wfarner@apache.org> wrote:
> What do you mean by GC burden?  What I'm proposing is effectively
> Map<String, String>.  Even with an extremely forgetful operator (even more
> than Joe!), it would require a huge oversight to put a dent in heap usage.
> I'm sure there are ways we could even expose a useful stat to flag such an
> oversight.
>
> On Tue, Jan 19, 2016 at 8:31 PM, Maxim Khutornenko <maxim@apache.org> wrote:
>
>> Right, that's what I thought. Yes, it sounds interesting. My only
>> concern is the GC burden of getting rid of hostnames that are obsolete
>> and no longer exist. Relying on offers to update hostname 'relevance'
>> may not work as dedicated hosts may be fully packed and not release
>> any resources for a very long time. Let me explore this idea a bit to
>> see what it would take to implement.
>>
>> On Tue, Jan 19, 2016 at 8:22 PM, Bill Farner <wfarner@apache.org> wrote:
>> > Not a host->attribute mapping (attribute in the mesos sense, anyway).
>> > Rather an out-of-band API for marking machines as reserved.  For
>> > task->offer mapping it's just a matter of another data source.  Does
>> > that make sense?
>> >
>> > On Tuesday, January 19, 2016, Maxim Khutornenko <maxim@apache.org> wrote:
>> >
>> >> > Can't this just be any old Constraint (not named "dedicated").  In other
>> >> > words, doesn't this code already deal with non-dedicated constraints?:
>> >> >
>> >> > https://github.com/apache/aurora/blob/master/src/main/java/org/apache/aurora/scheduler/filter/SchedulingFilterImpl.java#L193-L197
>> >>
>> >>
>> >> Not really. There is a subtle difference here. A regular (non-dedicated)
>> >> constraint does not prevent other tasks from landing on a given machine
>> >> set, whereas dedicated keeps other tasks away by only allowing those
>> >> matching the dedicated attribute. What this proposal targets is allowing
>> >> an exclusive machine pool to match any job that has this new constraint
>> >> while keeping all other tasks that don't have that attribute away.
>> >>
>> >> Following an example from my original post, imagine a GPU machine pool.
>> >> Any job (from any role) requiring a GPU resource would be allowed, while
>> >> all other jobs that don't have that constraint would be vetoed.
>> >>
>> >> > Also, regarding dedicated constraints necessitating a slave restart -
>> >> > I've pondered moving dedicated machine management to the scheduler for
>> >> > similar purposes.  There's not really much forcing that behavior to be
>> >> > managed with a slave attribute.
>> >>
>> >>
>> >> Would you mind giving a few more hints on the mechanics behind this? How
>> >> would the scheduler know about dedicated hw without the slave attributes
>> >> set? Are you proposing storing a hostname->attribute mapping in the
>> >> scheduler store?
>> >>
>> >> On Tue, Jan 19, 2016 at 7:53 PM, Bill Farner <wfarner@apache.org> wrote:
>> >>
>> >> > Joe - if you want to pursue this, I suggest you start another thread to
>> >> > keep this thread's discussion intact.  I will not be able to lead this
>> >> > change, but can certainly shepherd!
>> >> >
>> >> > On Tuesday, January 19, 2016, Joe Smith <yasumoto7@gmail.com> wrote:
>> >> >
>> >> > > As an operator, that'd be a relatively simple change in tooling, and
>> >> > > the benefits of not forcing a slave restart would be _huge_.
>> >> > >
>> >> > > Keeping the dedicated semantics (but adding non-exclusive) would be
>> >> > > ideal if possible.
>> >> > >
>> >> > > > On Jan 19, 2016, at 19:09, Bill Farner <wfarner@apache.org> wrote:
>> >> > > >
>> >> > > > Also, regarding dedicated constraints necessitating a slave restart -
>> >> > > > I've pondered moving dedicated machine management to the scheduler
>> >> > > > for similar purposes.  There's not really much forcing that behavior
>> >> > > > to be managed with a slave attribute.
>> >> > > >
>> >> > > > On Tue, Jan 19, 2016 at 7:05 PM, John Sirois <john@conductant.com> wrote:
>> >> > > >
>> >> > > >> On Tue, Jan 19, 2016 at 7:22 PM, Maxim Khutornenko <maxim@apache.org>
>> >> > > >> wrote:
>> >> > > >>
>> >> > > >>> Has anyone explored an idea of having a non-exclusive (wrt job role)
>> >> > > >>> dedicated constraint in Aurora before?
>> >> > > >>>
>> >> > > >>> We do have a dedicated constraint now but it assumes a 1:1
>> >> > > >>> relationship between a job role and a slave attribute [1]. For
>> >> > > >>> example: a 'www-data/prod/hello' job with a dedicated constraint of
>> >> > > >>> 'dedicated': 'www-data/hello' may only be pinned to a particular set
>> >> > > >>> of slaves if all of them have 'www-data/hello' attribute set. No
>> >> > > >>> other role tasks will be able to land on those slaves unless their
>> >> > > >>> 'role/name' pair is added into the slave attribute set.
>> >> > > >>>
>> >> > > >>> The above is very limiting as it prevents carving out subsets of a
>> >> > > >>> shared pool cluster to be used by multiple roles at the same time.
>> >> > > >>> Would it make sense to have a free-form dedicated constraint not
>> >> > > >>> bound to a particular role? Multiple jobs could then use this type of
>> >> > > >>> constraint dynamically without modifying the slave command line (and
>> >> > > >>> requiring slave restart).
>> >> > > >>
>> >> > > >> Can't this just be any old Constraint (not named "dedicated").  In
>> >> > > >> other words, doesn't this code already deal with non-dedicated
>> >> > > >> constraints?:
>> >> > > >>
>> >> > > >> https://github.com/apache/aurora/blob/master/src/main/java/org/apache/aurora/scheduler/filter/SchedulingFilterImpl.java#L193-L197
>> >> > > >>
>> >> > > >>
>> >> > > >>> This could be quite useful for experimenting purposes (e.g. different
>> >> > > >>> host OS) or to target a different hardware offering (e.g. GPUs). In
>> >> > > >>> other words, only those jobs that explicitly opt-in to participate in
>> >> > > >>> an experiment or hw offering would be landing on that slave set.
>> >> > > >>>
>> >> > > >>> Thanks,
>> >> > > >>> Maxim
>> >> > > >>>
>> >> > > >>> [1] - https://github.com/apache/aurora/blob/eec985d948f02f46637d87cd4d212eb2a70ef8d0/src/main/java/org/apache/aurora/scheduler/configuration/ConfigurationManager.java#L272-L276
>> >> > > >>
>> >> > > >>
>> >> > > >>
>> >> > > >> --
>> >> > > >> John Sirois
>> >> > > >> 303-512-3301
>> >> > > >>
>> >> > >
>> >> >
>> >>
>>
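
The two behaviours contrasted in the thread can be sketched side by side.
This is a simplified illustration, not Aurora's filter code: the class and
method names are hypothetical, the role check is only a reading of the
'role/name' rule described in the original post, and the pool check assumes
a host carries at most one free-form pool label.

```java
import java.util.Optional;

// Hypothetical sketch only: not taken from SchedulingFilterImpl or
// ConfigurationManager.
final class DedicatedSemantics {
  // Existing role-bound rule as described in the original post: a dedicated
  // value such as "www-data/hello" is usable only by jobs of role "www-data".
  static boolean roleMayUseDedicated(String jobRole, String dedicatedValue) {
    return dedicatedValue.startsWith(jobRole + "/");
  }

  // Proposed non-exclusive rule: a host in a pool accepts any task (from any
  // role) that asks for that pool and vetoes every other task; a task asking
  // for a pool is likewise vetoed on hosts outside that pool.
  static boolean isVetoed(Optional<String> hostPool, Optional<String> taskPool) {
    return !hostPool.equals(taskPool);
  }
}
```

With a "gpu" pool, a task constrained to "gpu" is accepted on a "gpu" host
regardless of its role, while an unconstrained task is vetoed there.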
