mesos-dev mailing list archives

From Niklas Nielsen <...@qni.dk>
Subject Re: Core affinity in Mesos
Date Mon, 01 Feb 2016 09:41:41 GMT
Ben,

I agree that isolation encompasses more than performance isolation, but
instead of proliferating overly granular working groups, I thought we could
start work under the 'isolation' working group. The group was inactive
before but already had an entry in the document. I have no strong
preference and can rename it to 'performance isolation'.


Deepak,

We are very interested in that area as well: placement biases based on
interference/sensitivity profiles, balancing power and load, etc. I hope
that we can get to a nicely decoupled way of doing this, so that those
details (analysis, objectives, etc.) don't leak into the allocator.

Cheers,
Niklas

On Fri, Jan 29, 2016 at 8:48 PM, Deepak Vij (A) <deepak.vij@huawei.com>
wrote:

> Along similar lines, interference-aware scheduling could be one of the
> desired capabilities of a resource manager like Mesos. This is
> essentially tied to the fact that not all data centers/nodes are
> homogeneous. Typically, it is assumed that all placement choices are
> equally good. In practice, however, different types of machines are mixed
> within the same cluster, and co-located tasks compete for resources,
> which leads to negative interference.
>
> In order to solve the interference-aware scheduling problem, one might
> have to periodically monitor the performance of running tasks and use the
> information collected to make better future scheduling decisions. Having
> explicit information about the environment helps make better choices for
> co-scheduling and workload partitioning, and may yield superior
> performance on many common workloads. The detailed resource utilization
> and performance profiles collected from running tasks could include
> measurements such as CPU and memory usage, cache misses, etc.
>
> My question is whether such an interference-aware scheduling capability
> would fit into a similar category or should be something separate
> altogether. Thanks.
>
> Regards,
> Deepak Vij
> (Huawei Software Lab., Santa Clara)
>
> -----Original Message-----
> From: Kevin Klues [mailto:klueska@gmail.com]
> Sent: Friday, January 29, 2016 11:28 AM
> To: dev@mesos.apache.org
> Subject: Re: Core affinity in Mesos
>
> I agree. "Isolation" on its own is too broad a term. However, since
> we are talking mostly about reducing interference, which typically
> implies performance isolation, my vote for the group name is the
> "Performance Isolation Working Group".
>
> On Fri, Jan 29, 2016 at 11:22 AM, Benjamin Mahler <bmahler@apache.org>
> wrote:
> > Since "Isolation" applies broadly outside of the context of addressing
> > latency sensitive workloads (e.g. user/pid/network namespacing, and
> > resource limitations such as cpu quota, memory limits, gpu device
> > visibility), it would be great to choose a more specific name. Some
> > suggestions: interference, performance-related isolation, colocation,
> > latency sensitivity.
> >
> > Thoughts?
> >
> > Looking forward to seeing the discussions here!
> >
> > Ben
> >
> > On Friday, January 22, 2016, Nielsen, Niklas <niklas.nielsen@intel.com>
> > wrote:
> >
> >> Hi everyone,
> >>
> >> We have been talking about core affinity in Mesos for a while, and
> >> Ian D. has recently been giving this topic thought in his ‘exclusive
> >> resources’ proposal [1].
> >> If we try to avoid overly conservative placements, latency-critical
> >> workloads are at risk without it.
> >> We are interested in the topic through our work on oversubscription in
> >> Serenity [2], as the point of oversubscription is precisely to be able
> >> to colocate latency-critical and best-effort batch jobs.
> >> We had an informal meeting yesterday, going over the proposal and trying
> >> to get some cadence behind the capability.
> >>
> >> It is a tricky but exciting topic:
> >>  - How do we avoid making task launch even more complex? How do we
> >> express the topology and acquire parts of it? Do we use hints on the
> >> affinity properties instead?
> >>  - How do we mix pinned tasks with normal ‘floating’ tasks?
> >>  - How do we convey information about task sensitivity to the resource
> >> estimator?
> >>
> >> Note: the above list is not meant for inline discussion or answers.
> >> Let’s collect feedback on the proposals themselves.
> >>
> >> Here are our proposed next steps:
> >>  - We are going to use the ‘Isolation Working Group’ as an umbrella for
> >> this. I will fill in details and members.
> >>  - We will schedule an online meeting for Wednesday 9AM PST next week
> >> to discuss next steps. I will share a hangout link when we get closer.
> >>  - The plan is to get to designs (maybe more than one) that we agree
> >> on, and then scope out and distribute the work that needs to be done.
> >>
> >> Whoever is interested, please join us. The use cases for this work
> >> are critical. Maybe we can even work on some representative workloads
> >> that we can verify our proposal against.
> >>
> >> Cheers,
> >> Niklas
> >>
> >> PS: For comments on the proposal itself, please refer to Ian’s thread
> >> on the dev list [3].
> >>
> >> [1] https://issues.apache.org/jira/browse/MESOS-4138
> >> [2] https://github.com/mesosphere/serenity
> >> [3] https://www.mail-archive.com/dev%40mesos.apache.org/msg33892.html
> >>
>
>
>
> --
> ~Kevin
>



-- 
Niklas
