aurora-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Stephan Erb" <step...@dev.static-void.de>
Subject Re: Review Request 33689: Updated scheduler to process status updates asynchronously in batches.
Date Sun, 10 May 2015 11:44:50 GMT


> On May 3, 2015, 4:01 p.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/UserTaskLauncher.java, line 70
> > <https://reviews.apache.org/r/33689/diff/1/?file=947578#file947578line70>
> >
> >     Any reason why you have chosen this particular value?
> 
> Ben Mahler wrote:
>     Are you asking why a maximum, or why this particular maximum? The idea behind the
maximum is to bound the amount of work that we do in the critical section (since it's a shared
resource, we're not the only ones using storage), but behind this particular value, no reason.
I've made it configurable via a flag.

Yeah, I somehow wondered why you chose 1000. But thinking about it, it sounds like a very
sane default.


> On May 3, 2015, 4:01 p.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/UserTaskLauncher.java, line 136
> > <https://reviews.apache.org/r/33689/diff/1/?file=947578#file947578line136>
> >
> >     Could be a loop-invariant ArrayList which is cleared at the beginning of each
interation.
> >     
> >     The tiny performance improvement is probably not measurable at all. However,
using a LinkedList here seems rather uncommon.
> 
> Ben Mahler wrote:
>     Hm.. ArrayList does not implement Queue, do you mean ArrayDeque? As you said, it's
likely not important, but what is important here is that we make it clear that updates is
FIFO (this is why Queue is used).

Ok, sounds reasonable.


- Stephan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33689/#review82343
-----------------------------------------------------------


On May 7, 2015, 2:27 a.m., Ben Mahler wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33689/
> -----------------------------------------------------------
> 
> (Updated May 7, 2015, 2:27 a.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko and Bill Farner.
> 
> 
> Bugs: AURORA-1228
>     https://issues.apache.org/jira/browse/AURORA-1228
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Now the processing of status updates is done asynchronously with batching to insulate
throughput from the expensive storage resource. Updates are placed into a queue and consumed
by another thread. If many updates arrive while we're storing a batch of updates, these will
be processed together in batch rather than individually.
> 
> 
> Diffs
> -----
> 
>   src/jmh/java/org/apache/aurora/benchmark/StatusUpdateBenchmark.java 7bb64dd913f0fe2fede95d50a061043dbb794ab4

>   src/jmh/java/org/apache/aurora/benchmark/fakes/FakeDriver.java 45de15a57baf7a2f7d437b590935714e28777f35

>   src/main/java/org/apache/aurora/scheduler/SchedulerModule.java d3ac176e9402a33fd2074b0737313458120da9e2

>   src/main/java/org/apache/aurora/scheduler/UserTaskLauncher.java 0ce9c9d4cf75f9add260f285115b1d60786ded57

>   src/main/java/org/apache/aurora/scheduler/async/GcExecutorLauncher.java 4d589a33a2933b0cb6caf85abfae45c5e635c3ce

>   src/main/java/org/apache/aurora/scheduler/mesos/Driver.java c7e45a89ceaa2c310feb610091eec0b04187860e

>   src/main/java/org/apache/aurora/scheduler/mesos/MesosSchedulerImpl.java 9b8ab7c1027731f9d3f6cae77b85272ea63354d4

>   src/main/java/org/apache/aurora/scheduler/mesos/SchedulerDriverService.java da2d5df2e053e6e1b8fb08d6813dff9eac9777f8

>   src/test/java/org/apache/aurora/scheduler/UserTaskLauncherTest.java 32432322753799562d671db39c0d7fa308d962ff

>   src/test/java/org/apache/aurora/scheduler/async/GcExecutorLauncherTest.java 422d5a9a42310979752eb7282658316c2b772419

>   src/test/java/org/apache/aurora/scheduler/mesos/MesosSchedulerImplTest.java abdeee49858fc439c27911c4eb544bf8e8c931d4

> 
> Diff: https://reviews.apache.org/r/33689/diff/
> 
> 
> Testing
> -------
> 
> Ran the benchmark to confirm that this improves status update throughput substantially:
> 
> Before: Around 100 updates per second for a 5ms storage latency. Much worse for higher
latencies.
> After:  Around 4k-5k updates per second for a 5ms storage latency, down to 3k updates
per second for 100ms storage latency.
> 
> Updated unit tests for the new invariants:
> 
> * TaskLaunchers are responsible for acknowledging updates.
> * UserTaskLauncher processes updates asynchronously.
> 
> 
> Thanks,
> 
> Ben Mahler
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message