aurora-reviews mailing list archives

From Aurora ReviewBot <wfar...@apache.org>
Subject Re: Review Request 63199: Refactor staticallyBannedOffers into a LRU cache
Date Sat, 21 Oct 2017 06:55:15 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63199/#review188890
-----------------------------------------------------------


Ship it!




Master (9825e05) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot retry"

- Aurora ReviewBot


On Oct. 21, 2017, 5:53 a.m., Jordan Ly wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63199/
> -----------------------------------------------------------
> 
> (Updated Oct. 21, 2017, 5:53 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Santhosh Kumar Shanmugham, Stephan Erb, and Bill Farner.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> With the new `hold_offers_forever` option, `staticallyBannedOffers` can grow very large because offers are never released.
> 
> As an alternative to https://reviews.apache.org/r/63121/, I propose changing `staticallyBannedOffers` into an LRU cache which expires entries after `min_offer_hold_time` + `offer_hold_jitter_window` (referred to as `maxOfferHoldTime`), while also taking an option for a maximum size for the cache. I believe that this approach has a couple of benefits:
> 
> 1. The current behavior of `staticallyBannedOffers` is (kinda) preserved. Entries will no longer be removed when the offer is used, but they will be removed within `maxOfferHoldTime`. This means cluster operators will not have to think about the new `offer_static_ban_cache_max_size` if they aren't affected by the memory leak now.
> 2. Cluster operators that use Aurora as a single framework and hold offers indefinitely can cap the size of the cache to avoid the memory leak.
> 3. Using an LRU cache greatly benefits quickly recurring crons and job updates.
> 
> 
> Diffs
> -----
> 
>   src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 5a9099bf9dd292249d72bc3a7604fbb3394f30ea
>   src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java 7011a4cc9eea827cdd54698aaed1a653774bce7f
>   src/main/java/org/apache/aurora/scheduler/offers/OfferSettings.java e060f2073dce4d2486d1ee2bfd873fe75167c6d0
>   src/main/java/org/apache/aurora/scheduler/offers/OffersModule.java e6b2c55e4f33f9a603157236766425edcaff10e7
>   src/test/java/org/apache/aurora/scheduler/config/CommandLineTest.java 5b502442163581daa4d7954b09c00bdc3680a726
>   src/test/java/org/apache/aurora/scheduler/offers/OfferManagerImplTest.java 6c8434e9cfe46ef63ff10c6f059ecb99981f29a2
> 
> 
> Diff: https://reviews.apache.org/r/63199/diff/4/
> 
> 
> Testing
> -------
> 
> Unit tests pass.
> Deployed on a scale test cluster and saw that a) the `staticallyBannedOffers` memory leak is fixed with the correct options and b) assignment time is lower for quickly recurring crons and rescheduled jobs.
> 
> 
> Thanks,
> 
> Jordan Ly
> 
>
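
For readers following the thread, the cache behavior described in the review request (entries expire after `maxOfferHoldTime`, and an optional maximum size evicts the least recently used entry) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in the patch: the class name `ExpiringLruBanCache`, its methods, and the injected clock are all hypothetical, and it is built on the JDK's `LinkedHashMap` purely to stay self-contained.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/**
 * Hypothetical sketch (not Aurora's implementation): a set of banned keys
 * that expires entries after a fixed hold time and caps its size with
 * least-recently-used eviction.
 */
final class ExpiringLruBanCache<K> {
  private final long maxHoldMillis;   // stands in for maxOfferHoldTime
  private final int maxSize;          // stands in for offer_static_ban_cache_max_size
  private final LongSupplier clock;   // injected so expiry can be tested deterministically
  private final LinkedHashMap<K, Long> entries;

  ExpiringLruBanCache(long maxHoldMillis, int maxSize, LongSupplier clock) {
    this.maxHoldMillis = maxHoldMillis;
    this.maxSize = maxSize;
    this.clock = clock;
    // accessOrder=true keeps the least recently used entry at the head.
    this.entries = new LinkedHashMap<K, Long>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<K, Long> eldest) {
        // Evict the least recently used entry once the cap is exceeded.
        return size() > ExpiringLruBanCache.this.maxSize;
      }
    };
  }

  /** Records a ban, stamped with the current time. */
  synchronized void ban(K key) {
    entries.put(key, clock.getAsLong());
  }

  /** Returns true if the key is banned and its ban has not yet expired. */
  synchronized boolean isBanned(K key) {
    Long bannedAt = entries.get(key); // get() also marks the entry as recently used
    if (bannedAt == null) {
      return false;
    }
    if (clock.getAsLong() - bannedAt >= maxHoldMillis) {
      entries.remove(key); // lazily drop the expired ban
      return false;
    }
    return true;
  }

  /** Current entry count; may include expired entries not yet looked up. */
  synchronized int size() {
    return entries.size();
  }
}
```

In this sketch expiry is lazy (an expired entry is dropped only when it is next looked up), so the size cap is what bounds memory when offers are held indefinitely.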

