aurora-dev mailing list archives

From David McLaughlin <dmclaugh...@apache.org>
Subject Re: Dynamic Reservations
Date Thu, 09 Mar 2017 02:34:13 GMT
Spoke with Zameer offline and he asked me to post additional thoughts here.

My motivation for solving this without dynamic reservations is just the
sheer number of questions I have after reading the RFC and current design
doc. And most of them are not about the current proposal and goals or the
MVP but more about how this feature will scale into persistent storage.

I think best-effort dynamic reservations are a very different problem from
the reservations that would be needed to support persistent storage. My
primary concern is around things like quota. For the current proposal and
the small best-effort feature we're adding, it makes no sense to get into
the complexities of separate quota for reserved resources vs preferred
resources, but the reality of exposing such a concept to a large
organisation where we can't automatically reclaim anything reserved means
we'd almost definitely want that. The issue with the iterative approach is
that decisions we take here could have a huge impact on those tasks later,
once we expose the reserved tier into the open. That means more upfront
design and planning, which so far has blocked a super useful feature that I
feel all of us want.

My gut feeling is we went about this all wrong. We started with dynamic
reservations and thought about how we could speed up task scheduling with
them. If we took the current problem brief and started from first
principles then I think we'd naturally look for something like a
replaceTask(offerId, taskInfo) type API from Mesos.
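
For illustration only, the kind of best-effort offer reuse discussed in this
thread could be sketched in Python as below. This is a toy model, not Aurora
or Mesos code: `Offer`, `HeldOffers`, and `schedule_instance` are hypothetical
names, and the hold lives purely in memory, so a scheduler restart simply
loses it and the task goes through the normal scheduling loop.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class Offer:
    """Hypothetical stand-in for a Mesos resource offer."""
    offer_id: str
    hostname: str
    cpus: float
    mem_mb: int


@dataclass
class HeldOffers:
    """Best-effort, in-memory holds of offers for specific task instances.

    Nothing is persisted: losing this map (e.g. on restart) only means the
    affected instances fall back to normal scheduling.
    """
    hold_secs: float
    clock: Callable[[], float] = time.monotonic
    _holds: Dict[str, Tuple[Offer, float]] = field(default_factory=dict, init=False)

    def hold(self, instance_key: str, offer: Offer) -> None:
        """Keep `offer` for `instance_key` until the hold expires."""
        self._holds[instance_key] = (offer, self.clock() + self.hold_secs)

    def claim(self, instance_key: str) -> Optional[Offer]:
        """Return the held offer if it is still valid, else None."""
        entry = self._holds.pop(instance_key, None)
        if entry is None:
            return None
        offer, deadline = entry
        return offer if self.clock() <= deadline else None


def schedule_instance(
    holds: HeldOffers,
    instance_key: str,
    launch: Callable[[Offer, str], None],
    enqueue: Callable[[str], None],
) -> str:
    """Reuse a held offer when possible, else fall back to normal scheduling."""
    offer = holds.claim(instance_key)
    if offer is not None:
        launch(offer, instance_key)  # fast path: skip the full scheduling loop
        return "reused"
    enqueue(instance_key)  # slow path: the scheduling loop used today
    return "rescheduled"
```

In this model the update loop would call `hold` right after killing the old
instance and `schedule_instance` for its replacement. An API along the lines
of replaceTask(offerId, taskInfo) would presumably move that kill-and-relaunch
step into Mesos itself.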

I'll bring this up within our team and see if we can put resources on
adding such an API. Any feedback on this approach in the meantime is
welcome.

On Wed, Mar 8, 2017 at 5:30 PM, David McLaughlin <dmclaughlin@apache.org>
wrote:

> You don't have to store anything with my proposal. Preemption doesn't
> store anything either. The whole thing is it's just best-effort, and if the
> Scheduler restarts the worst that would happen is part of the current batch
> would have to go through the current Scheduling loop that users tolerate
> and deal with today.
>
>
>
> On Wed, Mar 8, 2017 at 5:08 PM, Zameer Manji <zmanji@apache.org> wrote:
>
>> David,
>>
>> I have two concerns with that idea. First, it would require persisting the
>> relationship of <Hostname, Resources> to <Task> for every task. I'm not
>> sure if adding more storage and storage operations is the ideal way of
>> solving this problem. Second, in a multi framework environment, a
>> framework needs to use dynamic reservations otherwise the resources might
>> be taken by another framework.
>>
>> On Wed, Mar 8, 2017 at 5:01 PM, David McLaughlin <dmclaughlin@apache.org>
>> wrote:
>>
>> > So I read the docs again and I have one major question - do we even need
>> > dynamic reservations for the current proposal?
>> >
>> > The current goal of the proposed work is to keep an offer on a host and
>> > prevent some other pending task from taking it before the next scheduling
>> > round. This exact problem is solved in preemption and we could use a
>> > similar technique for reserving offers after killing tasks when going
>> > through the update loop. We wouldn't need to add tiers or reconciliation
>> > or solve any of these other concerns. Reusing an offer skips so much of the
>> > expensive stuff in the Scheduler that it would be a no-brainer for the
>> > operator to turn it on for every single task in the cluster.
>> >
>> >
>> > On Thu, Mar 2, 2017 at 7:52 AM, Steve Niemitz <sniemitz@apache.org>
>> > wrote:
>> >
>> > > I read over the docs, it looks like a good start.  Personally I don't see
>> > > much of a benefit for dynamically reserved cpu/mem, but I'm excited about
>> > > the possibility of building off this for dynamically reserved persistent
>> > > volumes.
>> > >
>> > > I would like to see more detail on how a reservation "times out", and the
>> > > configuration options per job around that, as I feel like it's the most
>> > > complicated part of all of this.  Ideally there would also be hooks into
>> > > the host maintenance APIs here.
>> > >
>> > > I also didn't see any mention of it, but I believe mesos requires the
>> > > framework to reserve resources with a role.  By default aurora runs as the
>> > > special "*" role, does this mean aurora will need to have a role specified
>> > > now for this to work?  Or does mesos allow reserving resources without a
>> > > role?
>> > >
>> > > On Thu, Mar 2, 2017 at 8:35 AM, Erb, Stephan <Stephan.Erb@blue-yonder.com>
>> > > wrote:
>> > >
>> > > > Hi everyone,
>> > > >
>> > > > There have been two documents on Dynamic Reservations as a first step
>> > > > towards persistent services:
>> > > >
>> > > > ·         RFC: https://docs.google.com/document/d/15n29HSQPXuFrnxZAgfVINTRP1Iv47_jfcstJNuMwr5A/edit#heading=h.hcsc8tda08vy
>> > > >
>> > > > ·         Technical Design Doc:  https://docs.google.com/document/d/1L2EKEcKKBPmuxRviSUebyuqiNwaO-2hsITBjt3SgWvE/edit#heading=h.klg3urfbnq3v
>> > > >
>> > > > For the past couple of days there have also been two patches online for an
>> > > > MVP by Dmitriy:
>> > > >
>> > > > ·         https://reviews.apache.org/r/56690/
>> > > >
>> > > > ·         https://reviews.apache.org/r/56691/
>> > > >
>> > > > From reading the documents, I am under the impression that there is a
>> > > > rough consensus on the following points:
>> > > >
>> > > > ·         We want dynamic reservations. Our general goal is to enable the
>> > > > re-scheduling of tasks on the same host they used in a previous run.
>> > > >
>> > > > ·         Dynamic reservations are a best-effort feature. If in doubt, a
>> > > > task will be scheduled somewhere else.
>> > > >
>> > > > ·         Jobs opt into reserved resources using an appropriate tier
>> > > > config.
>> > > >
>> > > > ·         The tier config is supposed to be neither preemptible nor
>> > > > revocable. Reserving resources therefore requires appropriate quota.
>> > > >
>> > > > ·         Aurora will tag reserved Mesos resources by adding the unique
>> > > > instance key of the reserving task instance as a label. Only this task
>> > > > instance will be allowed to use those tagged resources.
>> > > >
>> > > > I am unclear on the following general questions as there is contradictory
>> > > > content:
>> > > >
>> > > > a)       How does the user interact with reservations?  There are several
>> > > > proposals in the documents to auto-reserve on `aurora job create` or
>> > > > `aurora cron schedule` and to automatically un-reserve on the appropriate
>> > > > reverse actions. But will we also allow a user further control over the
>> > > > reservations so that they can manage those independent of the task/job
>> > > > lifecycle? For example, how does Borg handle this?
>> > > >
>> > > > b)       The implementation proposal and patches include an
>> > > > OfferReconciler, so this implies we don’t want to offer any control for the
>> > > > user. The only control mechanism will be the cluster-wide offer wait time
>> > > > limiting the number of seconds unused reserved resources can linger before
>> > > > they are un-reserved.
>> > > >
>> > > > c)       Will we allow adhoc/cron jobs to reserve resources? Does it even
>> > > > matter if we don’t give control to users and just rely on the
>> > > > OfferReconciler?
>> > > >
>> > > >
>> > > > I have a couple of questions on the MVP and some implementation details. I
>> > > > will follow up with those in a separate mail.
>> > > >
>> > > > Thanks and best regards,
>> > > > Stephan
>> > > >
>> > >
>> >
>> > --
>> > Zameer Manji
>> >
>>
>
>
