aurora-reviews mailing list archives

From "Ben Mahler" <benjamin.mah...@gmail.com>
Subject Re: Review Request 33689: Updated scheduler to process status updates asynchronously in batches.
Date Mon, 11 May 2015 18:55:40 GMT


> On May 7, 2015, 10:24 p.m., Maxim Khutornenko wrote:
> > src/jmh/java/org/apache/aurora/benchmark/StatusUpdateBenchmark.java, line 191
> > <https://reviews.apache.org/r/33689/diff/3/?file=951768#file951768line191>
> >
> >     Delete TODO.
> 
> Zameer Manji wrote:
>     +1

Done.


> On May 7, 2015, 10:24 p.m., Maxim Khutornenko wrote:
> > src/main/java/org/apache/aurora/scheduler/SchedulerModule.java, line 60
> > <https://reviews.apache.org/r/33689/diff/3/?file=951770#file951770line60>
> >
> >     @Positive

Done.


> On May 7, 2015, 10:24 p.m., Maxim Khutornenko wrote:
> > src/main/java/org/apache/aurora/scheduler/SchedulerModule.java, lines 66-67
> > <https://reviews.apache.org/r/33689/diff/3/?file=951770#file951770line66>
> >
> >     Suggest rephrasing to add the notion that this argument controls the interruption rate of otherwise blocking wait calls.

Re-phrased, let me know if you still think it's unclear!


> On May 7, 2015, 10:24 p.m., Maxim Khutornenko wrote:
> > src/main/java/org/apache/aurora/scheduler/SchedulerModule.java, line 106
> > <https://reviews.apache.org/r/33689/diff/3/?file=951770#file951770line106>
> >
> >     Drop, there is already a binding like that above.

Done, moved that one down to keep all the UserTaskLauncher bindings in the same place.


> On May 7, 2015, 10:24 p.m., Maxim Khutornenko wrote:
> > src/main/java/org/apache/aurora/scheduler/UserTaskLauncher.java, line 171
> > <https://reviews.apache.org/r/33689/diff/3/?file=951771#file951771line171>
> >
> >     Just use "maxBatchSize - updates.size()" to avoid second guessing "-1" origin.

I like it, great suggestion! :)
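
For anyone reading along, here is a minimal sketch of the batching pattern this comment refers to. It is not the code under review: the class name `BatchDrainSketch`, the method `nextBatch`, and the `waitMs` parameter are hypothetical, and only `maxBatchSize` and `updates` come from the comment above. It assumes a standard `java.util.concurrent.BlockingQueue` and shows why "maxBatchSize - updates.size()" reads more clearly than a hard-coded "-1".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; not the actual UserTaskLauncher implementation.
class BatchDrainSketch {
  // Waits up to waitMs for the first update, then drains whatever else is
  // already queued, capped so the batch never exceeds maxBatchSize.
  static <T> List<T> nextBatch(BlockingQueue<T> queue, int maxBatchSize, long waitMs)
      throws InterruptedException {
    List<T> updates = new ArrayList<>();
    T first = queue.poll(waitMs, TimeUnit.MILLISECONDS);
    if (first == null) {
      return updates; // Timed out with nothing queued: empty batch.
    }
    updates.add(first);
    // Remaining capacity is derived from the batch itself, not a magic "-1".
    queue.drainTo(updates, maxBatchSize - updates.size());
    return updates;
  }

  public static void main(String[] args) throws InterruptedException {
    BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    for (int i = 0; i < 5; i++) {
      queue.add("update-" + i);
    }
    System.out.println(nextBatch(queue, 3, 1)); // [update-0, update-1, update-2]
    System.out.println(nextBatch(queue, 3, 1)); // [update-3, update-4]
    System.out.println(nextBatch(queue, 3, 1)); // []
  }
}
```

Updates that pile up while a batch is being stored are picked up together by the next call, which is where the throughput gain comes from.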


> On May 7, 2015, 10:24 p.m., Maxim Khutornenko wrote:
> > src/test/java/org/apache/aurora/scheduler/UserTaskLauncherTest.java, line 84
> > <https://reviews.apache.org/r/33689/diff/3/?file=951776#file951776line84>
> >
> >     Use 0L instead to avoid any delays in unit tests.

Switched to 1ms instead: 0ms would turn it into a busy loop. Are you OK with that?
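
To illustrate the busy-loop concern, here is a small sketch under the assumption that the consumer polls a `BlockingQueue` with a timeout. The class `PollTimeoutSketch` and the method `countPolls` are hypothetical names used only for this example. With a zero timeout, `poll` returns immediately on an empty queue, so the loop spins; with 1ms, each iteration parks the thread.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; names are illustrative and not from the patch.
class PollTimeoutSketch {
  // Counts how many times an empty queue is polled within runForMs
  // when each poll waits at most timeoutMs.
  static long countPolls(long timeoutMs, long runForMs) throws InterruptedException {
    BlockingQueue<Object> empty = new LinkedBlockingQueue<>();
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(runForMs);
    long polls = 0;
    while (System.nanoTime() < deadline) {
      empty.poll(timeoutMs, TimeUnit.MILLISECONDS); // Always null: queue stays empty.
      polls++;
    }
    return polls;
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("0ms timeout polls: " + countPolls(0, 50));
    System.out.println("1ms timeout polls: " + countPolls(1, 50));
  }
}
```

The exact counts depend on the machine, but the zero-timeout loop should iterate far more often than the 1ms one, burning CPU the whole time.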


> On May 7, 2015, 10:24 p.m., Maxim Khutornenko wrote:
> > src/test/java/org/apache/aurora/scheduler/UserTaskLauncherTest.java, line 111
> > <https://reviews.apache.org/r/33689/diff/3/?file=951776#file951776line111>
> >
> >     .once() is redundant

Removed.


> On May 7, 2015, 10:24 p.m., Maxim Khutornenko wrote:
> > src/test/java/org/apache/aurora/scheduler/UserTaskLauncherTest.java, line 124
> > <https://reviews.apache.org/r/33689/diff/3/?file=951776#file951776line124>
> >
> >     s/Object/Void

Done.


> On May 7, 2015, 10:24 p.m., Maxim Khutornenko wrote:
> > src/test/java/org/apache/aurora/scheduler/UserTaskLauncherTest.java, line 141
> > <https://reviews.apache.org/r/33689/diff/3/?file=951776#file951776line141>
> >
> >     Suggest using 5 seconds as it's more than enough time to run a single update.

Ok, 5 sounds fine as well.


> On May 7, 2015, 10:24 p.m., Maxim Khutornenko wrote:
> > src/main/java/org/apache/aurora/scheduler/SchedulerModule.java, lines 99-104
> > <https://reviews.apache.org/r/33689/diff/3/?file=951770#file951770line99>
> >
> >     Suggest wrapping these into a UserTaskLauncherSettings as a more scalable/lower overhead solution. See example here:
> >     https://github.com/apache/aurora/blob/ef0975655c04f0c2f3ecb6599d4e4beb9547f091/src/main/java/org/apache/aurora/scheduler/async/GcExecutorLauncher.java#L259

Done!


- Ben


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33689/#review82916
-----------------------------------------------------------


On May 7, 2015, 12:27 a.m., Ben Mahler wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33689/
> -----------------------------------------------------------
> 
> (Updated May 7, 2015, 12:27 a.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko and Bill Farner.
> 
> 
> Bugs: AURORA-1228
>     https://issues.apache.org/jira/browse/AURORA-1228
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Status updates are now processed asynchronously in batches to insulate throughput from the expensive storage resource. Updates are placed into a queue and consumed by another thread. If many updates arrive while a batch is being stored, they are processed together as the next batch rather than individually.
> 
> 
> Diffs
> -----
> 
>   src/jmh/java/org/apache/aurora/benchmark/StatusUpdateBenchmark.java 7bb64dd913f0fe2fede95d50a061043dbb794ab4
>   src/jmh/java/org/apache/aurora/benchmark/fakes/FakeDriver.java 45de15a57baf7a2f7d437b590935714e28777f35
>   src/main/java/org/apache/aurora/scheduler/SchedulerModule.java d3ac176e9402a33fd2074b0737313458120da9e2
>   src/main/java/org/apache/aurora/scheduler/UserTaskLauncher.java 0ce9c9d4cf75f9add260f285115b1d60786ded57
>   src/main/java/org/apache/aurora/scheduler/async/GcExecutorLauncher.java 4d589a33a2933b0cb6caf85abfae45c5e635c3ce
>   src/main/java/org/apache/aurora/scheduler/mesos/Driver.java c7e45a89ceaa2c310feb610091eec0b04187860e
>   src/main/java/org/apache/aurora/scheduler/mesos/MesosSchedulerImpl.java 9b8ab7c1027731f9d3f6cae77b85272ea63354d4
>   src/main/java/org/apache/aurora/scheduler/mesos/SchedulerDriverService.java da2d5df2e053e6e1b8fb08d6813dff9eac9777f8
>   src/test/java/org/apache/aurora/scheduler/UserTaskLauncherTest.java 32432322753799562d671db39c0d7fa308d962ff
>   src/test/java/org/apache/aurora/scheduler/async/GcExecutorLauncherTest.java 422d5a9a42310979752eb7282658316c2b772419
>   src/test/java/org/apache/aurora/scheduler/mesos/MesosSchedulerImplTest.java abdeee49858fc439c27911c4eb544bf8e8c931d4
> 
> Diff: https://reviews.apache.org/r/33689/diff/
> 
> 
> Testing
> -------
> 
> Ran the benchmark to confirm that this improves status update throughput substantially:
> 
> Before: Around 100 updates per second for a 5ms storage latency. Much worse for higher latencies.
> After:  Around 4k-5k updates per second for a 5ms storage latency, down to 3k updates per second for 100ms storage latency.
> 
> Updated unit tests for the new invariants:
> 
> * TaskLaunchers are responsible for acknowledging updates.
> * UserTaskLauncher processes updates asynchronously.
> 
> 
> Thanks,
> 
> Ben Mahler
> 
>

