aurora-reviews mailing list archives

From "Maxim Khutornenko" <ma...@apache.org>
Subject Re: Review Request 31739: Making task preemption asynchronous.
Date Wed, 04 Mar 2015 23:10:26 GMT


> On March 4, 2015, 8:17 p.m., Bill Farner wrote:
> > Is there a reason you did not opt to implement this behind the `Preemptor` interface? Seems like if you went with that approach, `TaskScheduler` can be oblivious to the background operations.
> 
> Maxim Khutornenko wrote:
>     Trying to keep things simple. Moving it behind `Preemptor` would require sharing `Reservations` (or some equivalent feedback notification) between TaskScheduler and Preemptor.
> 
> Bill Farner wrote:
>     I don't see why the data structure would need to be shared.  On one call you could asynchronously kick off the work, and a subsequent call could report back the result of the previous.
> 
> Maxim Khutornenko wrote:
>     Perhaps I am missing the point, but how does it correlate with "TaskScheduler can be oblivious to the background operations"? If there is no immediate response back from the preemptor, what is responsible for getting the reservation data, and when? Where does that reservation data live in this case?
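Bill's suggestion above (kick off the work on one call, report the result on a later call) could be sketched roughly as follows. This is only an illustration under stated assumptions: `DeferredPreemptor`, `attemptPreemptionFor`, the `slotSearch` function, and the plain string task and slave IDs are hypothetical names, not the actual `Preemptor` interface in Aurora.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.function.Function;

// Hypothetical sketch, not Aurora code: the caller never sees the background work,
// it only gets "nothing yet" or the result of a search started by an earlier call.
final class DeferredPreemptor {
  // Searches that have been started but whose result has not been handed out yet.
  private final Map<String, CompletableFuture<Optional<String>>> pending =
      new ConcurrentHashMap<>();
  // Assumed stand-in for the expensive search: task ID -> slave ID to reserve, if any.
  private final Function<String, Optional<String>> slotSearch;
  private final Executor executor;

  DeferredPreemptor(Function<String, Optional<String>> slotSearch, Executor executor) {
    this.slotSearch = slotSearch;
    this.executor = executor;
  }

  /**
   * Returns the result of a search started by a previous call if it has finished.
   * Otherwise starts a search (if none is running for this task) and returns empty.
   * If the search itself threw, join() rethrows that failure to the caller.
   */
  Optional<String> attemptPreemptionFor(String taskId) {
    CompletableFuture<Optional<String>> search = pending.computeIfAbsent(
        taskId, id -> CompletableFuture.supplyAsync(() -> slotSearch.apply(id), executor));
    if (!search.isDone()) {
      return Optional.empty();
    }
    // Hand the result out once; the next call for this task starts a fresh search.
    pending.remove(taskId, search);
    return search.join();
  }
}
```

In this sketch the pending results live inside the preemptor, so the caller does not share a reservations structure with it. The open question from the thread remains: the caller still has to call again to learn the outcome.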

Chatted with Bill offline and we agreed that we should start moving towards a standalone (background
worker) preemptor. That would require moving async decision making into the preemptor itself.
I am going to discard this RB and start working towards what benefits us longer term.


- Maxim


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31739/#review75226
-----------------------------------------------------------


On March 4, 2015, 7:30 p.m., Maxim Khutornenko wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/31739/
> -----------------------------------------------------------
> 
> (Updated March 4, 2015, 7:30 p.m.)
> 
> 
> Review request for Aurora, Bill Farner and Zameer Manji.
> 
> 
> Bugs: AURORA-1158
>     https://issues.apache.org/jira/browse/AURORA-1158
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Reservations now happen asynchronously, with a configurable delay between a failed task scheduling attempt and a preemption attempt.
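As a rough illustration of the behaviour described above (not the code in this diff), a delayed, off-loop reservation could look like the sketch below. `DelayedReservations`, `onSchedulingFailed`, `reservationFor`, the `slotSearch` function, and the string IDs are all assumed names for the example.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical sketch: a failed scheduling attempt only enqueues a delayed
// preemption attempt, so the scheduling loop itself never runs the search.
final class DelayedReservations {
  // taskId -> slaveId reserved for that task by a completed preemption attempt.
  private final Map<String, String> reservations = new ConcurrentHashMap<>();
  private final ScheduledExecutorService executor;
  private final long delayMs;
  // Assumed stand-in for the preemption slot search.
  private final Function<String, Optional<String>> slotSearch;

  DelayedReservations(
      ScheduledExecutorService executor,
      long delayMs,
      Function<String, Optional<String>> slotSearch) {
    this.executor = executor;
    this.delayMs = delayMs;
    this.slotSearch = slotSearch;
  }

  /** Called from the scheduling loop; returns immediately. */
  void onSchedulingFailed(String taskId) {
    executor.schedule(
        () -> slotSearch.apply(taskId).ifPresent(slave -> reservations.put(taskId, slave)),
        delayMs,
        TimeUnit.MILLISECONDS);
  }

  /** Consulted on a later scheduling round for the same task. */
  Optional<String> reservationFor(String taskId) {
    return Optional.ofNullable(reservations.get(taskId));
  }
}
```

The delay is a constructor argument here only to mirror the "configurable delay" wording; how the real change wires its configuration is not shown in this message.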
> 
> Added a new `PreemptorBenchmark` to measure preemption perf, as preemption now happens off the main scheduling loop and is thus unreachable by earlier benchmarks.
> 
> Benchmark results are unsurprisingly great. The biggest winner is the PreemptorFallbackForLargeClusterBenchmark (now ClusterFullUtilizationBenchmark). Without the preemptor fallback, and thanks to static veto offer filtering, it's now 99.995% faster :)
> 
> The lowest gain is for the limit constraint benchmark. It's the only dynamic veto type and thus is not subject to offer filtering. Still, a ~71% improvement is nothing to complain about.
> 
> Before:
> Benchmark                                                                     Mode  Cnt         Score        Error  Units
> SchedulingBenchmarks.InsufficientResourcesSchedulingBenchmark.runBenchmark    avgt  100    781243.004 ±   9308.450  ns/op
> SchedulingBenchmarks.LimitConstraintMismatchSchedulingBenchmark.runBenchmark  avgt  100   1205278.826 ±  19800.452  ns/op
> SchedulingBenchmarks.PreemptorFallbackForLargeClusterBenchmark.runBenchmark   avgt  100  77048458.974 ± 918593.702  ns/op
> SchedulingBenchmarks.ValueConstraintMismatchSchedulingBenchmark.runBenchmark  avgt  100    769919.326 ±  18963.264  ns/op
> 
> 
> After:
> Benchmark                                                                     Mode  Cnt        Score        Error  Units
> SchedulingBenchmarks.InsufficientResourcesSchedulingBenchmark.runBenchmark    avgt  100    28117.603 ±    243.556  ns/op
> SchedulingBenchmarks.LimitConstraintMismatchSchedulingBenchmark.runBenchmark  avgt  100   348667.808 ±   2956.521  ns/op
> SchedulingBenchmarks.ClusterFullUtilizationBenchmark.runBenchmark             avgt  100     3978.828 ±    351.186  ns/op
> SchedulingBenchmarks.ValueConstraintMismatchSchedulingBenchmark.runBenchmark  avgt  100    26096.782 ±    412.138  ns/op
> SchedulingBenchmarks.PreemptorBenchmark.runBenchmark                          avgt  100  6054216.773 ± 105428.318  ns/op
> 
> Perf gain summary:
> InsufficientResourcesSchedulingBenchmark     - 96.4%
> LimitConstraintMismatchSchedulingBenchmark   - 71%
> PreemptorFallbackForLargeClusterBenchmark    - 99.995%
> ValueConstraintMismatchSchedulingBenchmark   - 96.6%
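The percentages in the summary are consistent with reading "gain" as the relative reduction in average time per operation, (before - after) / before. That reading is an assumption, since the message does not state the formula. A small helper (`PerfGain` and `gainPercent` are names made up for this example) makes the arithmetic explicit:

```java
// Hypothetical helper for checking the summary against the JMH scores quoted above.
final class PerfGain {
  private PerfGain() {}

  /** Relative reduction in ns/op, as a percentage: (before - after) / before * 100. */
  static double gainPercent(double beforeNsPerOp, double afterNsPerOp) {
    return (beforeNsPerOp - afterNsPerOp) / beforeNsPerOp * 100.0;
  }
}
```

For example, `PerfGain.gainPercent(781243.004, 28117.603)` takes the InsufficientResourcesSchedulingBenchmark scores from the two tables, and `PerfGain.gainPercent(77048458.974, 3978.828)` compares PreemptorFallbackForLargeClusterBenchmark with its renamed successor, ClusterFullUtilizationBenchmark. Both agree with the summary after rounding.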
> 
> 
> Diffs
> -----
> 
>   src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 3239eaa139e35e8c3acdacf6375f492de2b5bfee
>   src/main/java/org/apache/aurora/scheduler/async/AsyncModule.java e87dda47a355654c66f6f54fb25a4d9a7f68422d
>   src/main/java/org/apache/aurora/scheduler/async/TaskScheduler.java d0fe3e133cbec2418f31160bf8ab8adaa45bb958
>   src/test/java/org/apache/aurora/scheduler/async/TaskSchedulerImplTest.java 4ee13c8e5d46ba863f4d9871884c7d494d07758d
>   src/test/java/org/apache/aurora/scheduler/async/TaskSchedulerTest.java 87bc531d2a72f21c36ddd0c1bd3b2367826cc422
> 
> Diff: https://reviews.apache.org/r/31739/diff/
> 
> 
> Testing
> -------
> 
> ./gradlew -Pq build
> Manual testing in vagrant.
> 
> 
> Thanks,
> 
> Maxim Khutornenko
> 
>

