aurora-reviews mailing list archives

From Stephan Erb <s...@apache.org>
Subject Re: Review Request 62956: Immediately reject offers lacking necessary resources
Date Fri, 13 Oct 2017 08:12:36 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62956/#review187939
-----------------------------------------------------------




src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java
Lines 67-68 (patched)
<https://reviews.apache.org/r/62956/#comment265006>

    As far as I know, this will filter the agent entirely for 30 days, which comes pretty close to leaking agents. See https://github.com/apache/mesos/blob/2fe2bb26a425da9aaf1d7cf34019dd347d0cf9a4/src/master/allocator/mesos/hierarchical.cpp#L1207-L1209
    
    This implies the timeout would need to be significantly smaller (e.g. ~3 minutes) and configurable for operators. At that point, I am no longer sure the optimization would help at Twitter-scale clusters.



src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java
Lines 220-224 (patched)
<https://reviews.apache.org/r/62956/#comment265005>

    This won't work for us.
    
    We are using both non-revocable and revocable (CPU & RAM) resources. It is crucial for us that we can still use revocable resources on an agent even if the non-revocable resources are maxed out. The same applies vice versa.
    
    This pseudo code should solve it:
    ```
    bool lacksUsefulResources(offer):
        no_revocable = revocable_mem <= mem_threshold || revocable_cpu <= cpu_threshold
        no_non_revocable = mem <= mem_threshold || cpu <= cpu_threshold
        
        return no_revocable and no_non_revocable
    ```
    
    Would that still work for you? 
    
    (As a minor improvement of the heuristic, we could use the minimal executor resources as thresholds rather than 0.)
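
The pseudocode above could be sketched in Java roughly as follows. This is only an illustration under stated assumptions: `OfferUsefulness`, `OfferedResources`, and the two threshold parameters are hypothetical names, not taken from the Aurora code base, and a real patch would read these quantities from the Mesos offer rather than from a plain holder class.

```java
// Hypothetical sketch: class, field, and method names are illustrative only
// and do not come from the Aurora code base.
final class OfferUsefulness {

  /** Plain holder for the four quantities the heuristic looks at. */
  static final class OfferedResources {
    final double cpus;
    final double memMb;
    final double revocableCpus;
    final double revocableMemMb;

    OfferedResources(double cpus, double memMb, double revocableCpus, double revocableMemMb) {
      this.cpus = cpus;
      this.memMb = memMb;
      this.revocableCpus = revocableCpus;
      this.revocableMemMb = revocableMemMb;
    }
  }

  /**
   * Returns true only if neither the revocable nor the non-revocable part of
   * the offer has both CPU and memory above the given thresholds.
   */
  static boolean lacksUsefulResources(
      OfferedResources offer, double cpuThreshold, double memThresholdMb) {
    // A resource class is unusable as soon as either CPU or memory is at or below its threshold.
    boolean noRevocable =
        offer.revocableMemMb <= memThresholdMb || offer.revocableCpus <= cpuThreshold;
    boolean noNonRevocable =
        offer.memMb <= memThresholdMb || offer.cpus <= cpuThreshold;
    return noRevocable && noNonRevocable;
  }

  public static void main(String[] args) {
    // Non-revocable resources are exhausted, but revocable CPU and RAM remain.
    OfferedResources revocableOnly = new OfferedResources(0, 0, 2, 1024);
    // Each class is missing one of CPU or RAM, so nothing can be scheduled.
    OfferedResources nothingUsable = new OfferedResources(4, 0, 0, 512);

    System.out.println(lacksUsefulResources(revocableOnly, 0, 0)); // false
    System.out.println(lacksUsefulResources(nothingUsable, 0, 0)); // true
  }
}
```

With thresholds of 0, an agent whose non-revocable resources are maxed out is kept as long as it still offers revocable CPU and RAM, which is the case that matters for us. Passing the minimal executor resources as thresholds would only change the arguments, not the logic.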


- Stephan Erb


On Oct. 13, 2017, 1:18 a.m., Bill Farner wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62956/
> -----------------------------------------------------------
> 
> (Updated Oct. 13, 2017, 1:18 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Jordan Ly.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> There's no reason for us to evaluate offers with no CPUs or memory, so reject them early in the offer lifecycle.
> 
> This is an incremental performance optimization, but it may net significant improvements based on observations in some very large clusters.
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/aurora/scheduler/http/Utilization.java 3c77e2983ce00f897f3d5ed106b779cd7f7f0940
>   src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java e8334310a2a46a0ccb09ee6e4122c515892d3996
>   src/main/java/org/apache/aurora/scheduler/preemptor/PreemptionVictimFilter.java 1b1239753f40d7d46d91724def6c25037eb79f1c
>   src/main/java/org/apache/aurora/scheduler/resources/ResourceBag.java d5db81b88a0369d0b26c8fbf70efab3886ad7695
>   src/main/java/org/apache/aurora/scheduler/stats/TaskStatCalculator.java b98aaaf48ae60afef19a368ee96abc897300f8fa
>   src/test/java/org/apache/aurora/scheduler/offers/OfferManagerImplTest.java 2cfdc090ff75a63111ae146c9fe7b3542e7ac83f
>   src/test/java/org/apache/aurora/scheduler/offers/Offers.java 129b4437315c6ad4ea47ca75d4ae6e28cadd7911
>   src/test/java/org/apache/aurora/scheduler/resources/ResourceTestUtil.java 765a527acb96997989c920be8b69dfa1113dc302
> 
> 
> Diff: https://reviews.apache.org/r/62956/diff/2/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Bill Farner
> 
>

