aurora-reviews mailing list archives

From David McLaughlin <da...@dmclaughlin.com>
Subject Re: Review Request 57487: Implementation of Dynamic Reservations Proposal
Date Tue, 04 Apr 2017 23:07:29 GMT


> On March 30, 2017, 11:56 p.m., David McLaughlin wrote:
> > The motivation for this is a performance optimization (less scheduling loop overhead
> > plus cache locality on the target host). So why should that decision be encoded in the
> > service tier? We'd want every single task using this, without users even knowing about
> > it. And we still want to have the preferred vs. preemptible distinction.
> > 
> > Currently a task restart is a powerful tool to undo a bad scheduling round or to get off
> > a host for whatever reason, e.g. to get away from a noisy neighbor or a machine that's
> > close to falling over. If I'm reading this patch correctly, users lose this ability after
> > this change? Or at least a restart now becomes: kill the task, wait for some
> > operator-defined timeout, and then schedule it again with the original config.
> > 
> > What happens when we want to extend the use of dynamic reservations and give users
> > control over when they are collected? What tier would we use then? How would reserved
> > offers be collected? It seems like this implementation is not future-proof at all.
> 
> Dmitriy Shirchenko wrote:
>     David, thanks for your comment. Beyond the performance optimization, I would add the
>     following improvements and features that this patch offers:
>     
>     * Consistent MTTA for a job of any size when upgrading, irrespective of cluster capacity
>       and demand, assuming the upgrade does not increase the resource vector (sizing down is OK).
>     * Shorter MTTR for tasks using the Docker or unified containerizer: reserved tasks get
>       consistent placement on the same host, resulting in less work for the Mesos or Docker
>       fetcher, since the host's warm cache can be leveraged and the previous image layers
>       already exist on that host.
>     * After a job is placed, a failed task cannot get stuck in the PENDING state, because
>       resource availability on its host is guaranteed.
>     * This implementation lays the foundation for supporting persistent volumes in Aurora.
>     
>     The way the tier is added, you absolutely can make a reserved job preemptible. All you
>     would do is specify a new tier definition in tiers.json and set both 'reserved' and
>     'preemptible' to `true`.
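
For concreteness, such a tier definition might look like the sketch below, assuming the
existing tiers.json structure with `revocable` and `preemptible` flags plus the `reserved`
attribute this patch proposes; the `reserved-preemptible` tier name is purely illustrative:

    {
      "default": "preemptible",
      "tiers": {
        "revocable":   {"revocable": true,  "preemptible": true},
        "preemptible": {"revocable": false, "preemptible": true},
        "preferred":   {"revocable": false, "preemptible": false},
        "reserved-preemptible": {"revocable": false, "preemptible": true, "reserved": true}
      }
    }
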
>     
>     About restarts, you bring up a good point. I would like to add that if a task does not
>     have `reserved` set to `true` in its `TierInfo`, then nothing changes and restarts
>     proceed by rescheduling the task onto a different host. However, if a task wants
>     reserved resources, that implies to us that it wants "stickiness", so the task would be
>     scheduled on the same host. I agree that contradicts the use case of trying to get away
>     from noisy neighbors, and yes, the story is not great for that case. We can brainstorm
>     possible solutions. If this is an immediately required feature, we can apply an
>     `unreserve` operation to any offer that comes back from a reserved task before
>     rescheduling it. How does that sound?
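
As a rough illustration, such an `unreserve` step might look like the sketch below against the
Mesos Java API; the helper class and its wiring into the restart path are hypothetical and not
part of this patch:

    import java.util.Collections;
    import java.util.List;
    import java.util.stream.Collectors;

    import org.apache.mesos.Protos;
    import org.apache.mesos.SchedulerDriver;

    // Hypothetical helper: releases the dynamic reservation held by an offer
    // before the restarted task is rescheduled elsewhere.
    final class ReservationReleaser {
      private ReservationReleaser() { }

      static void unreserve(SchedulerDriver driver, Protos.Offer offer) {
        // Only dynamically reserved resources carry ReservationInfo.
        List<Protos.Resource> reserved = offer.getResourcesList().stream()
            .filter(Protos.Resource::hasReservation)
            .collect(Collectors.toList());

        if (reserved.isEmpty()) {
          // Nothing to release; decline so the offer goes back to the allocator.
          driver.declineOffer(offer.getId());
          return;
        }

        Protos.Offer.Operation operation = Protos.Offer.Operation.newBuilder()
            .setType(Protos.Offer.Operation.Type.UNRESERVE)
            .setUnreserve(Protos.Offer.Operation.Unreserve.newBuilder()
                .addAllResources(reserved))
            .build();

        // Accepting the offer with only an UNRESERVE operation returns the
        // resources to the unreserved pool without launching anything.
        driver.acceptOffers(
            Collections.singletonList(offer.getId()),
            Collections.singletonList(operation),
            Protos.Filters.getDefaultInstance());
      }
    }

The restart path would then fall through to the normal, non-sticky scheduling logic for that task.
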

>     
>     Would you elaborate on what you are referring to regarding control over dynamically
>     reserved resources? Do we currently give users any control beyond host constraints?
>     Currently, reserved offers are not collected; with @serb's nice suggestion they are
>     simply expired if the offer goes unused. To collect them, we can bring back the
>     `OfferReconciler` if the complexity warrants it.

Well, the main point about preemptible is that users now have to opt into the reserved tier,
and we now need to set up tiers for each combination of parameters (side note: I'm not a huge
fan of the static tiers concept; it seems broken to me). Currently there is no easy way to
automatically migrate running tasks from one tier to another, nor to force users to update to a
certain tier without making a custom client. In fact, because of the DSL it's not even possible
to upgrade all of the job configs to this new tier either.

Why is this a problem? From our use case at Twitter, we want this feature in order to run our
clusters close to full capacity. Currently Aurora does not behave well when there is very
little headroom. The fact that preemptible or revocable tasks can get launched on an offer
vacated by a production task as part of an update is problematic: it can force the prod job
to go through the preemption loop to get the same slot back, which adds hours to a production
job's deploy time (and the churn created severely impacts those trying to run test jobs).
Because this is a performance optimization, we do not want users to opt in to this feature; we
want it applied to every service without users even knowing.


My point about control over dynamically reserved resources relates to your last point that
this lays a foundation for persistent storage. Right now we are using the reserved tier and
we automatically reclaim the resources with a timer. In the case of persistent storage, the
decision to reclaim a reservation, even if the host is down for hours, should be totally in
the hands of the service owner. That is at least what we would need to meet the use cases we
have here. So what you'll end up needing is another tier, "reallyReserved" or something
similar, to be able to disable the automatic unreserving of resources (this is true with the
current mechanism in this patch, or with the old reconciliation logic you had). Having these
two reserved tiers would be confusing for users.
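
To make that concern concrete, you could imagine ending up with something like the sketch
below, where a hypothetical `autoUnreserve` attribute (not in this patch, purely illustrative)
is the only thing distinguishing the two reserved tiers:

    {
      "tiers": {
        "reserved":       {"revocable": false, "preemptible": false, "reserved": true},
        "reallyReserved": {"revocable": false, "preemptible": false, "reserved": true,
                           "autoUnreserve": false}
      }
    }
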


I'm really not convinced we want to use dynamic reservations for this problem. As I said on
the dev list, I think doing this with the same mechanisms as preemption, with some shared state
between JobUpdateController and TaskAssigner, is a cleaner solution that leaves the reserved
tier open for when we actually need it. At the absolute minimum, if we use dynamic reservations
we need to hide that fact from the user.
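
For what it's worth, here is a minimal sketch of the kind of shared state I mean; the class
and method names are hypothetical and not existing Aurora APIs:

    import java.util.Map;
    import java.util.Optional;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical in-memory reservation shared between the JobUpdateController
    // and the TaskAssigner; not an existing Aurora interface.
    final class AgentReservations {
      // Task key (role/env/job/instance) -> host of the agent the task vacated.
      private final Map<String, String> reservations = new ConcurrentHashMap<>();

      // Called by the update controller when it kills an instance it intends to
      // replace, earmarking the vacated slot for the replacement.
      void reserve(String taskKey, String agentHost) {
        reservations.put(taskKey, agentHost);
      }

      // Consulted by the task assigner: a task holding a reservation should only
      // match offers from its reserved agent.
      Optional<String> reservedAgent(String taskKey) {
        return Optional.ofNullable(reservations.get(taskKey));
      }

      // Consulted for all other tasks: offers from a reserved agent are skipped
      // so the vacated slot is not given away during the update.
      boolean isReserved(String agentHost) {
        return reservations.containsValue(agentHost);
      }

      // Released once the replacement instance is assigned or the update moves on.
      void release(String taskKey) {
        reservations.remove(taskKey);
      }
    }

The update controller would reserve the vacated agent when it kills an instance, and the entry
would be released once the replacement is assigned, so nothing about reservations ever leaks
into user-facing configuration.
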


- David


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57487/#review170659
-----------------------------------------------------------


On March 31, 2017, 8:52 p.m., Dmitriy Shirchenko wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/57487/
> -----------------------------------------------------------
> 
> (Updated March 31, 2017, 8:52 p.m.)
> 
> 
> Review request for Aurora, Mehrdad Nurolahzade, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Esteemed reviewers, here is the latest iteration of the implementation of dynamic
> reservations. Changes include merging the patches into a single one and an updated design
> document with a more high-level overview and user stories told from an operator's point of
> view. Unit tests will be written as soon as we agree on the approach; I have tested this
> patch on local Vagrant and a multi-node dev cluster. The Jenkins build is expected to fail
> as tests are incomplete.
> 
> For reference, here are the previous two patches whose feedback I addressed in this new
> single patch.
> Previous two patches:
> https://reviews.apache.org/r/56690/
> https://reviews.apache.org/r/56691/
> 
> RFC document: https://docs.google.com/document/d/15n29HSQPXuFrnxZAgfVINTRP1Iv47_jfcstJNuMwr5A
> Design Doc [UPDATED]: https://docs.google.com/document/d/1L2EKEcKKBPmuxRviSUebyuqiNwaO-2hsITBjt3SgWvE
> 
> 
> Diffs
> -----
> 
>   src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java f2296a9d7a88be7e43124370edecfe64415df00f

>   src/jmh/java/org/apache/aurora/benchmark/fakes/FakeOfferManager.java 6f2ca35c5d83dde29c24865b4826d4932e96da80

>   src/main/java/org/apache/aurora/scheduler/HostOffer.java bc40d0798f40003cab5bf6efe607217e4d5de9f1

>   src/main/java/org/apache/aurora/scheduler/TaskVars.java 676dfd9f9d7ee0633c05424f788fd0ab116976bb

>   src/main/java/org/apache/aurora/scheduler/TierInfo.java c45b949ae7946fc92d7e62f94696ddc4f0790cfa

>   src/main/java/org/apache/aurora/scheduler/TierManager.java c6ad2b1c48673ca2c14ddd308684d81ce536beca

>   src/main/java/org/apache/aurora/scheduler/base/InstanceKeys.java b12ac83168401c15fb1d30179ea8e4816f09cd3d

>   src/main/java/org/apache/aurora/scheduler/base/TaskTestUtil.java f0b148cd158d61cd89cc51dca9f3fa4c6feb1b49

>   src/main/java/org/apache/aurora/scheduler/configuration/ConfigurationManager.java ad6b3efb69d71e8915044abafacec85f8c9efc59

>   src/main/java/org/apache/aurora/scheduler/events/NotifyingSchedulingFilter.java f6c759f03c4152ae93317692fc9db202fe251122

>   src/main/java/org/apache/aurora/scheduler/filter/SchedulingFilter.java 36608a9f027c95723c31f9915852112beb367223

>   src/main/java/org/apache/aurora/scheduler/filter/SchedulingFilterImpl.java df51d4cf4893899613683603ab4aa9aefa88faa6

>   src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java 0d639f66db456858278b0485c91c40975c3b45ac

>   src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java 78255e6dfa31c4920afc0221ee60ec4f8c2a12c4

>   src/main/java/org/apache/aurora/scheduler/offers/OfferSettings.java adf7f33e4a72d87c3624f84dfe4998e20dc75fdc

>   src/main/java/org/apache/aurora/scheduler/offers/OffersModule.java 317a2d26d8bfa27988c60a7706b9fb3aa9b4e2a2

>   src/main/java/org/apache/aurora/scheduler/preemptor/PreemptionVictimFilter.java 5ed578cc4c11b49f607db5f7e516d9e6022a926c

>   src/main/java/org/apache/aurora/scheduler/resources/AcceptedOffer.java 291d5c95916915afc48a7143759e523fccd52feb

>   src/main/java/org/apache/aurora/scheduler/resources/MesosResourceConverter.java 7040004ae48d3a9d0985cb9b231f914ebf6ff5a4

>   src/main/java/org/apache/aurora/scheduler/resources/ResourceManager.java 9aa263a9cfae03a9a0c5bc7fe3a1405397d3009c

>   src/main/java/org/apache/aurora/scheduler/scheduling/ReservationTimeoutCalculator.java PRE-CREATION
>   src/main/java/org/apache/aurora/scheduler/scheduling/SchedulingModule.java 03a0e8485d1a392f107fda5b4af05b7f8f6067c6

>   src/main/java/org/apache/aurora/scheduler/scheduling/TaskScheduler.java 203f62bacc47470545d095e4d25f7e0f25990ed9

>   src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java a177b301203143539b052524d14043ec8a85a46d

>   src/main/java/org/apache/aurora/scheduler/stats/AsyncStatsModule.java 40451e91aed45866c2030d901160cc4e084834df

>   src/main/resources/org/apache/aurora/scheduler/tiers.json 34ddb1dc769a73115c209c9b2ee158cd364392d8

>   src/test/java/org/apache/aurora/scheduler/TierManagerTest.java 82e40d509d84c37a19b6a9ef942283d908833840

>   src/test/java/org/apache/aurora/scheduler/configuration/ConfigurationManagerTest.java d6904f844df3880fb699948b3a7fd457c9e81ed0
>   src/test/java/org/apache/aurora/scheduler/http/OffersTest.java 30699596a1c95199df7504f62c5c18cab1be1c6c

>   src/test/java/org/apache/aurora/scheduler/mesos/MesosTaskFactoryImplTest.java 93cc34cf8393f969087cd0fd6f577228c00170e9

>   src/test/java/org/apache/aurora/scheduler/offers/HostOffers.java PRE-CREATION 
>   src/test/java/org/apache/aurora/scheduler/offers/OfferManagerImplTest.java d7addc0effb60c196cf339081ad81de541d05385

>   src/test/java/org/apache/aurora/scheduler/resources/AcceptedOfferTest.java dded9c34749cf599d197ed312ffb6bf63b6033f1

>   src/test/java/org/apache/aurora/scheduler/resources/ResourceManagerTest.java b8b8edb1a21ba89b8b60f8f8451c8c776fc23ae8

>   src/test/java/org/apache/aurora/scheduler/resources/ResourceTestUtil.java e04f6113c43eca4555ee0719f8208d7c4ebb8d61

>   src/test/java/org/apache/aurora/scheduler/scheduling/ReservationTimeoutCalculatorTest.java PRE-CREATION
>   src/test/java/org/apache/aurora/scheduler/scheduling/TaskSchedulerImplTest.java fa1a81785802b82542030e1aae786fe9570d9827

>   src/test/java/org/apache/aurora/scheduler/sla/SlaTestUtil.java 78f440f7546de9ed6842cb51db02b3bddc9a74ff

>   src/test/java/org/apache/aurora/scheduler/state/TaskAssignerImplTest.java cf2d25ec2e407df7159e0021ddb44adf937e1777

> 
> 
> Diff: https://reviews.apache.org/r/57487/diff/5/
> 
> 
> Testing
> -------
> 
> Tested on local Vagrant for the following scenarios:
> * Reserving a task
> * Making sure the returned offer comes back
> * Making sure the offer is unreserved
> 
> 
> Thanks,
> 
> Dmitriy Shirchenko
> 
>

