aurora-reviews mailing list archives

From Jordan Ly <jordan....@gmail.com>
Subject Re: Review Request 62956: Immediately reject offers lacking necessary resources
Date Tue, 17 Oct 2017 22:51:27 GMT


> On Oct. 17, 2017, 6:26 p.m., Santhosh Kumar Shanmugham wrote:
> > src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java
> > Lines 67-68 (patched)
> > <https://reviews.apache.org/r/62956/diff/2/?file=1854107#file1854107line67>
> >
> >     Since the Scheduler fails over every 24 hours, maybe we can let the new
> >     Scheduler retry the Slave?
> >     
> >     30 days seems like a very high threshold, and it could lead to a tight
> >     capacity situation without much warning. Typically in those scenarios, we
> >     manually churn the cluster to free up space. I wonder how the 30-day filter
> >     would behave in such a case. Having said that, should we make this
> >     configurable with a reasonable default (a few hours)?
> 
> Jordan Ly wrote:
>     I believe that the filter only works against a specific framework ID, so that
>     a scheduler failover or deploy would receive the offers again.
> 
> Jordan Ly wrote:
>     Additionally, does churning the cluster mean new offers would be generated?
>     If so, I think that they would get a new offer ID and be reissued.
> 
> Stephan Erb wrote:
>     The framework ID remains constant across failovers of both Aurora schedulers
>     and Mesos masters. Otherwise we'd lose all currently running tasks during a
>     failover.
>     
>     For the filtering, I am under the impression that it is per agent and
>     independent of offers or offer IDs. To be safe, we should check with some
>     Mesos developers though :)

You are correct, the framework ID will remain constant and the filters will stay in place.

For the filtering, I have been told that if you refuse an offer with x resources, Mesos will
not offer those resources to you again as long as they stay the same. However, if the
resources increase, Mesos will offer them to the framework again.

Could we take advantage of the reviveOffers() call to remove filters on scheduler initialization?
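
To make the behaviour described above concrete, here is a minimal Java sketch. It is
a toy model only, not Mesos or Aurora code: the class OfferFilterSketch and its
methods decline, wouldOffer and revive are hypothetical names invented for
illustration, and the rules it encodes are just the ones stated in this thread
(filters are kept per agent, a refused offer is suppressed until the agent's
resources grow or the refusal expires, and a revive drops all filters). The real
allocator may well differ in detail.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Toy model of the decline-filter behaviour discussed in this thread.
 * NOT Mesos code: every name here is hypothetical.
 */
public class OfferFilterSketch {

  /** What was refused on one agent, and when that refusal expires. */
  private static final class Refusal {
    final double cpus;
    final double memMb;
    final long expiresAtMs;

    Refusal(double cpus, double memMb, long expiresAtMs) {
      this.cpus = cpus;
      this.memMb = memMb;
      this.expiresAtMs = expiresAtMs;
    }
  }

  // Keyed by agent ID: the filter is assumed to be per agent, not per offer ID.
  private final Map<String, Refusal> filters = new HashMap<>();

  /** Records that an offer from this agent was declined for refuseMs milliseconds. */
  public void decline(String agentId, double cpus, double memMb, long nowMs, long refuseMs) {
    filters.put(agentId, new Refusal(cpus, memMb, nowMs + refuseMs));
  }

  /** Returns true if, under this model, the resources would be offered again. */
  public boolean wouldOffer(String agentId, double cpus, double memMb, long nowMs) {
    Refusal refusal = filters.get(agentId);
    if (refusal == null || nowMs >= refusal.expiresAtMs) {
      return true; // No filter, or the filter has expired.
    }
    // Still filtered unless the agent now has more than what was refused.
    return cpus > refusal.cpus || memMb > refusal.memMb;
  }

  /** Models a revive call on scheduler initialization: drops every filter. */
  public void revive() {
    filters.clear();
  }

  public static void main(String[] args) {
    OfferFilterSketch allocator = new OfferFilterSketch();
    long thirtyDaysMs = 30L * 24 * 60 * 60 * 1000;

    // Decline an offer that has no CPUs or memory, with a 30 day filter.
    allocator.decline("agent-1", 0.0, 0.0, 0L, thirtyDaysMs);
    System.out.println("same resources: " + allocator.wouldOffer("agent-1", 0.0, 0.0, 1_000L));
    System.out.println("more resources: " + allocator.wouldOffer("agent-1", 2.0, 4096.0, 1_000L));

    // A newly elected scheduler clears the filters it inherited.
    allocator.revive();
    System.out.println("after revive:   " + allocator.wouldOffer("agent-1", 0.0, 0.0, 1_000L));
  }
}
```

If the model holds, calling the equivalent of revive() once when a scheduler
becomes leader would let it see every agent again, whatever refusal duration the
previous leader used.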


- Jordan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62956/#review188356
-----------------------------------------------------------


On Oct. 12, 2017, 11:18 p.m., Bill Farner wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62956/
> -----------------------------------------------------------
> 
> (Updated Oct. 12, 2017, 11:18 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Jordan Ly.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> There's no reason for us to evaluate offers with no CPUs or memory, so reject
> them early in the offer lifecycle.
> 
> This is an incremental performance optimization, but it may net significant
> improvements based on observations in some very large clusters.
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/aurora/scheduler/http/Utilization.java 3c77e2983ce00f897f3d5ed106b779cd7f7f0940
>   src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java e8334310a2a46a0ccb09ee6e4122c515892d3996
>   src/main/java/org/apache/aurora/scheduler/preemptor/PreemptionVictimFilter.java 1b1239753f40d7d46d91724def6c25037eb79f1c
>   src/main/java/org/apache/aurora/scheduler/resources/ResourceBag.java d5db81b88a0369d0b26c8fbf70efab3886ad7695
>   src/main/java/org/apache/aurora/scheduler/stats/TaskStatCalculator.java b98aaaf48ae60afef19a368ee96abc897300f8fa
>   src/test/java/org/apache/aurora/scheduler/offers/OfferManagerImplTest.java 2cfdc090ff75a63111ae146c9fe7b3542e7ac83f
>   src/test/java/org/apache/aurora/scheduler/offers/Offers.java 129b4437315c6ad4ea47ca75d4ae6e28cadd7911
>   src/test/java/org/apache/aurora/scheduler/resources/ResourceTestUtil.java 765a527acb96997989c920be8b69dfa1113dc302
> 
> 
> Diff: https://reviews.apache.org/r/62956/diff/2/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Bill Farner
> 
>

