aurora-reviews mailing list archives

From David McLaughlin <da...@dmclaughlin.com>
Subject Re: Review Request 58259: Add update affinity to Scheduler
Date Thu, 04 May 2017 16:21:09 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58259/
-----------------------------------------------------------

(Updated May 4, 2017, 4:21 p.m.)


Review request for Aurora, Santhosh Kumar Shanmugham, Stephan Erb, and Zameer Manji.


Changes
-------

Addressed review feedback.

Also renamed misleading state names.


Repository: aurora


Description
-------

In the Dynamic Reservations review (and on the mailing list), I mentioned that we could implement
update affinity with less complexity using the same technique as preemption. Here is how that
would work. 

This just adds a simple wrapper around the preemptor's BiCache structure and then optimistically
tries to keep an agent free for a task during the update process. 
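
As a rough illustration of the idea (not the UpdateAgentReserver code in this diff), the sketch below keeps a time-limited mapping from an instance to the agent it last ran on. The class name, the method names, and the String keys are all hypothetical, and a plain HashMap with explicit timestamps stands in for the preemptor's BiCache.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: an expiring "instance -> agent" reservation table.
// It only mirrors the idea described above; it is not Aurora's BiCache.
final class AgentReservationSketch {
  private static final class Entry {
    final String agentId;
    final long expiresAtMs;

    Entry(String agentId, long expiresAtMs) {
      this.agentId = agentId;
      this.expiresAtMs = expiresAtMs;
    }
  }

  private final Map<String, Entry> byInstance = new HashMap<>();
  private final long holdMs;

  AgentReservationSketch(long holdMs) {
    this.holdMs = holdMs;
  }

  // Optimistically hold the agent for this instance; no resource check here.
  void reserve(String instanceKey, String agentId, long nowMs) {
    byInstance.put(instanceKey, new Entry(agentId, nowMs + holdMs));
  }

  // Drop the reservation, e.g. once the instance has been assigned.
  void release(String instanceKey) {
    byInstance.remove(instanceKey);
  }

  // The agent held for this instance, if the reservation has not expired.
  Optional<String> agentFor(String instanceKey, long nowMs) {
    evictExpired(nowMs);
    Entry e = byInstance.get(instanceKey);
    return e == null ? Optional.empty() : Optional.of(e.agentId);
  }

  // True if any live reservation points at this agent.
  boolean isReserved(String agentId, long nowMs) {
    evictExpired(nowMs);
    return byInstance.values().stream().anyMatch(e -> e.agentId.equals(agentId));
  }

  private void evictExpired(long nowMs) {
    byInstance.values().removeIf(e -> e.expiresAtMs <= nowMs);
  }
}
```

Because entries expire on their own, a reservation that is never used simply times out and the agent returns to normal scheduling.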


Note: I don't even bother checking the resources before reserving the agent. I figure there
is a chance the agent has enough room, and if not we'll catch it when we attempt to veto the
offer. We always need to check the offer like this anyway in case constraints change. In the
worst case it adds some delay in the rare cases where an update increases resources.
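
To make that ordering concrete, here is a minimal sketch of an offer check under stated assumptions: offers from an agent held for a different instance are declined, and the resource check runs at offer time whether or not a reservation exists. The class, the method signature, the CPU/RAM parameters, and the veto strings are hypothetical and are not taken from TaskAssigner or the diff.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of "reserve first, validate at offer time".
final class ReservedOfferCheckSketch {
  private ReservedOfferCheckSketch() {
  }

  /** Returns a veto reason, or empty if the offer can be used for the instance. */
  static Optional<String> veto(
      String offerAgentId,
      double offerCpus,
      long offerRamMb,
      String instanceKey,
      double taskCpus,
      long taskRamMb,
      Map<String, String> reservedAgentToInstance) {

    // Assumption: an agent held for another instance is off limits.
    String holder = reservedAgentToInstance.get(offerAgentId);
    if (holder != null && !holder.equals(instanceKey)) {
      return Optional.of("agent reserved for " + holder);
    }

    // The reservation was made without looking at resources, so the offer
    // is still checked here, even for the instance that holds the agent.
    if (offerCpus < taskCpus || offerRamMb < taskRamMb) {
      return Optional.of("insufficient resources");
    }
    return Optional.empty();
  }
}
```

If the reserved agent turns out to be too small after a resource increase, the instance is vetoed on that agent and has to wait for another offer, which is the delay mentioned above.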

We also don't persist the reservations, so if the Scheduler fails over during an update, the
worst case is that any instances in the in-flight batch that are between KILLED and ASSIGNED
need to fall back to the current first-fit scheduling algorithm.


Diffs (updated)
-----

  RELEASE-NOTES.md 1ee0d0167f0d212fa028f8ed0ddf20849703685c
  docs/operations/configuration.md f0581eade7296494d744b944ee8fddb94a5abaf6
  src/main/java/org/apache/aurora/scheduler/base/TaskTestUtil.java f0b148cd158d61cd89cc51dca9f3fa4c6feb1b49
  src/main/java/org/apache/aurora/scheduler/resources/ResourceBag.java 211f1fc4ce46b853124f63d8ab8e37ac2f3d2d92
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskScheduler.java 203f62bacc47470545d095e4d25f7e0f25990ed9
  src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java a177b301203143539b052524d14043ec8a85a46d
  src/main/java/org/apache/aurora/scheduler/updater/InstanceAction.java b4cd01b3e03029157d5ca5d1d8e79f01296b57c2
  src/main/java/org/apache/aurora/scheduler/updater/InstanceActionHandler.java f25dc0c6d9c05833b9938b023669c9c36a489f68
  src/main/java/org/apache/aurora/scheduler/updater/InstanceUpdater.java c129896d8cd54abd2634e2a339c27921042b0162
  src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java e14112479807b4477b82554caf84fe733f62cf58
  src/main/java/org/apache/aurora/scheduler/updater/StateEvaluator.java c95943d242dc2f539778bdc9e071f342005e8de3
  src/main/java/org/apache/aurora/scheduler/updater/UpdateAgentReserver.java PRE-CREATION
  src/main/java/org/apache/aurora/scheduler/updater/UpdaterModule.java 13cbdadad606d9acaadc541320b22b0ae538cc5e
  src/test/java/org/apache/aurora/scheduler/resources/ResourceBagTest.java c1638265373e8f4cb508c08d5a62e07027eff9c8
  src/test/java/org/apache/aurora/scheduler/scheduling/TaskSchedulerImplTest.java fa1a81785802b82542030e1aae786fe9570d9827
  src/test/java/org/apache/aurora/scheduler/state/TaskAssignerImplTest.java cf2d25ec2e407df7159e0021ddb44adf937e1777
  src/test/java/org/apache/aurora/scheduler/updater/AddTaskTest.java b2c4c66850dd8f35e06a631809530faa3b776252
  src/test/java/org/apache/aurora/scheduler/updater/InstanceUpdaterTest.java df1f8394b824dbb7b2745fcccdab5adaafdf6e6c
  src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 30b44f88a5b8477e917da21d92361aea1a39ceeb
  src/test/java/org/apache/aurora/scheduler/updater/KillTaskTest.java 833fd62c870f96b96343ee5e0eed0d439536381f
  src/test/java/org/apache/aurora/scheduler/updater/NullAgentReserverTest.java PRE-CREATION
  src/test/java/org/apache/aurora/scheduler/updater/UpdateAgentReserverImplTest.java PRE-CREATION



Diff: https://reviews.apache.org/r/58259/diff/6/

Changes: https://reviews.apache.org/r/58259/diff/5-6/


Testing
-------

./gradlew build
./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh


File Attachments
----------------

Cache utilization over time
  https://reviews.apache.org/media/uploaded/files/2017/04/25/7b41bd2b-4151-482c-9de2-9dee67c34133__declining-cache-hits.png
Offer rate from Mesos over time
  https://reviews.apache.org/media/uploaded/files/2017/04/25/b107d964-ee7d-435a-a3d9-2b54f6eac3fa__consistent-offer-rate.png
Async task workload (scaled) correlation with degraded cache utilization
  https://reviews.apache.org/media/uploaded/files/2017/04/25/7eaf37ac-fbf3-40eb-b3f6-90e914a3936f__async-task-correlation.png
Cache hit rate before and after scheduler tuning
  https://reviews.apache.org/media/uploaded/files/2017/05/02/39998e8d-2a75-4f5d-bfc0-bb93011407af__Screen_Shot_2017-05-01_at_6.30.18_PM.png
JobUpdateControllerImpl bottleneck
  https://reviews.apache.org/media/uploaded/files/2017/05/02/f93484bd-c99e-4c01-9f8a-f0ad867adb26__Screen_Shot_2017-05-02_at_3.33.39_PM.png


Thanks,

David McLaughlin

