aurora-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Zameer Manji <zma...@apache.org>
Subject Re: Review Request 57717: Support Mesos Maintenance
Date Fri, 17 Mar 2017 21:48:55 GMT


> On March 17, 2017, 7:45 a.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/filter/SchedulingFilterImpl.java
> > Lines 135 (patched)
> > <https://reviews.apache.org/r/57717/diff/1/?file=1666630#file1666630line135>
> >
> >     General question: Is this value only used internally or also shown somewhere?

It's shown in the UI sometimes if there are no other vetos.


> On March 17, 2017, 7:45 a.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandler.java
> > Lines 326-330 (patched)
> > <https://reviews.apache.org/r/57717/diff/1/?file=1666632#file1666632line327>
> >
> >     In our clusters we set the offer filter to 0, so we have to be thoughtful with
how we use the option here. 
> >     
> >     If we ever decide to use this filter for declining an offer, we would build
an andless loop where an offer will bounce between Mesos and Aurora constantly.
> >     
> >     It is OK now, as the filter does not seem to do anything if we accept in offer
(?). However, maybe we should just use the default filter for now to prevent potential accidents
in the future.

Agreed. Changed to using the default instance.


> On March 17, 2017, 7:45 a.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandler.java
> > Lines 333-334 (patched)
> > <https://reviews.apache.org/r/57717/diff/1/?file=1666632#file1666632line334>
> >
> >     We run the `handleOffers` code in a callback. Should we move the portion of
the `handleInverseOffer` code that relies on the storage lock (i.e. `drainForInverseOffer`)
to a callback as well?

Good catch. For simplicity, I put all of the code in a callback.


> On March 17, 2017, 7:45 a.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/mesos/SchedulerDriverService.java
> > Lines 140-142 (patched)
> > <https://reviews.apache.org/r/57717/diff/1/?file=1666633#file1666633line140>
> >
> >     I suppose we never call this method on the (legacy) SchedulerDriver?

Yes, the only accepting of an inverse offer is triggered from the callback handler. That code
can only be triggered when we get an inverse offer which is not possible with the scheduler
driver.


> On March 17, 2017, 7:45 a.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java
> > Lines 281-291 (patched)
> > <https://reviews.apache.org/r/57717/diff/1/?file=1666636#file1666636line282>
> >
> >     I am not sure if I get this one. My first assumption was we could write something
like:
> >     
> >     ```
> >     Ordering.
> >       .natural()
> >       .reverse()
> >       .onResultOf((Function<HostOffer, Instant>)
> >           o -> o.getUnavailabilityStart().or(Instant.MAX)));
> >     
> >     ```
> >     
> >     Or would this miss some cases covered by the code in the patch?

Good catch. This is left over code from a previous refactor.


> On March 17, 2017, 7:45 a.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/offers/OffersModule.java
> > Lines 79-80 (patched)
> > <https://reviews.apache.org/r/57717/diff/1/?file=1666638#file1666638line79>
> >
> >     Why do we need the offer filter duration here?
> >     
> >     In any case, we also need `offer_hold_jitter_window` to be included.

Typo. this should be the offer_hold_jitter_window.


> On March 17, 2017, 7:45 a.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/offers/OffersModule.java
> > Lines 83 (patched)
> > <https://reviews.apache.org/r/57717/diff/1/?file=1666638#file1666638line83>
> >
> >     an -> and

Done.


> On March 17, 2017, 7:45 a.m., Stephan Erb wrote:
> > src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> > Lines 227-236 (patched)
> > <https://reviews.apache.org/r/57717/diff/1/?file=1666650#file1666650line227>
> >
> >     The test using this helper will be waiting for over 2.5min. Would be great if
you replace the sleeps with proper "wait until loops" in `assert_task_status` to gain some
performance and robustness. For an example, see `test_observer_ui`.

Done.


- Zameer


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57717/#review169264
-----------------------------------------------------------


On March 16, 2017, 6:25 p.m., Zameer Manji wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/57717/
> -----------------------------------------------------------
> 
> (Updated March 16, 2017, 6:25 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Stephan Erb.
> 
> 
> Bugs: AURORA-1904
>     https://issues.apache.org/jira/browse/AURORA-1904
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> This adds support for Mesos Maintenance per the design doc[1].
> 
> Per the design the scheduler gains another parameter, `unavailability_threshold`. With
this threshold the scheduler does the following:
> 
> 1. Accept all inverse offers from Mesos.
> 2. Drain when accepting an inverse offer if the unavailability starts within the thereshold.
> 3. Veto any offers with unavailability starting within the threshold.
> 4. Penalize offers that have unavailablity information
> 
> For readability and safety the time based code uses the new `java.time` package in Java
8, primarily relying on the `Instant` class.
> 
> [1]: https://docs.google.com/document/d/1Z7dFAm6I1nrBE9S5WHw0D0LApBumkIbHrk0-ceoD2YI/edit#heading=h.n5tvzjaj9llx
> 
> 
> Diffs
> -----
> 
>   commons/src/main/java/org/apache/aurora/common/util/Clock.java 5c4ced1ffe7827c0e529d17cb51db42fd1b762ff

>   commons/src/main/java/org/apache/aurora/common/util/testing/FakeClock.java 104f2c64196da16d68a85e365f1dc762547e1e36

>   examples/vagrant/upstart/aurora-scheduler.conf 31fa0368435a179698d1a745331a85430049762e

>   src/jmh/java/org/apache/aurora/benchmark/fakes/FakeDriver.java 45f59c0bd09f81916c95345233e6642b4cf81830

>   src/main/java/org/apache/aurora/scheduler/HostOffer.java 23f0600d64e1e15f4856f397e839e3d1c87f3b96

>   src/main/java/org/apache/aurora/scheduler/base/Conversions.java 8295216dc651eff357c4f3c51c8a53052244c6bf

>   src/main/java/org/apache/aurora/scheduler/filter/SchedulingFilter.java bb1a960a4c77f48b0ceaa213bd27546551f384f9

>   src/main/java/org/apache/aurora/scheduler/filter/SchedulingFilterImpl.java 60097d91d836e2686d6e90571f13a2fbfd88ae14

>   src/main/java/org/apache/aurora/scheduler/mesos/Driver.java 71547ce931e0161adfc5de43f367b3ec43aa17e8

>   src/main/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandler.java 801551bce7879989d93d2d32a8fe28a891312c73

>   src/main/java/org/apache/aurora/scheduler/mesos/SchedulerDriverService.java f65a29d7ad8bc49784e324e674f30a6728a9d4ae

>   src/main/java/org/apache/aurora/scheduler/mesos/VersionedMesosSchedulerImpl.java 84e3f47636d95521600e9a4c4d5b8bc8bbbff8cf

>   src/main/java/org/apache/aurora/scheduler/mesos/VersionedSchedulerDriverService.java
d928d02cab087991a8cd8896d4366f6e5dca0913 
>   src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java 8c000cb0626bd34f6f30e23fe2b3a045f2b44e35

>   src/main/java/org/apache/aurora/scheduler/offers/OfferSettings.java e16e36ed360ef9ca371df9084365ea88cfb6e7ce

>   src/main/java/org/apache/aurora/scheduler/offers/OffersModule.java 202cae96ffc5b49e638b973a273f7983137b5baf

>   src/main/java/org/apache/aurora/scheduler/preemptor/PreemptionVictimFilter.java ba49e7a4ccfaddbd85218018b0bbad5efab41d99

>   src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 574efc9e44a21fc7cdc0d316d6c51f47cd673ce3

>   src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java da378e84ee65a658ff2382489d3ab6d5f6451b5f

>   src/test/java/org/apache/aurora/scheduler/filter/SchedulingFilterImplTest.java 1d7f9f45e7a65838e2c826b4b21a31c7944eab19

>   src/test/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandlerTest.java 80f631e9024e266fe823d845193b19c1d559a5ef

>   src/test/java/org/apache/aurora/scheduler/mesos/VersionedSchedulerDriverServiceTest.java
72aede85829f087bc88760e8b564d25aceb8aed8 
>   src/test/java/org/apache/aurora/scheduler/offers/OfferManagerImplTest.java 49d4e82cc03144b80292fe43066a6cc4d7aed88f

>   src/test/java/org/apache/aurora/scheduler/preemptor/PreemptionVictimFilterTest.java
02bfc51a7cba1116334dbfe30e0abe05ba3fbb4a 
>   src/test/java/org/apache/aurora/scheduler/state/MaintenanceControllerImplTest.java
ae83dea05e10ebab0c0b07d60386d0faf78fb7e9 
>   src/test/sh/org/apache/aurora/e2e/generate_mesos_maintenance_schedule.py PRE-CREATION

>   src/test/sh/org/apache/aurora/e2e/http/http_example.aurora de8179228d9359900eadf4084355ea257bea45ba

>   src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh 80b4c54774a02fdc2ee0e36d26f81aedd2e0055e

> 
> 
> Diff: https://reviews.apache.org/r/57717/diff/1/
> 
> 
> Testing
> -------
> 
> e2e tests
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message