aurora-reviews mailing list archives

From Jordan Ly <jordan....@gmail.com>
Subject Re: Review Request 63121: Remove static bans for task groups that are no longer pending
Date Sat, 21 Oct 2017 06:04:54 GMT


> On Oct. 20, 2017, 3:16 p.m., Bill Farner wrote:
> > Capturing some offline analysis/discussion - under certain conditions this patch
> > might do more harm than good.  In clusters with very high churn rates (e.g. services being
> > rescheduled frequently, high cron volume), static bans that outlive scheduling rounds can
> > prevent a significant amount of redundant scheduling work.  Jordan is experimenting with using
> > an LRU cache for static bans instead, which would allow us to mitigate the memory leak while
> > still avoiding redundant work.
> > 
> > I suggest we hold on this patch until Jordan's analysis yields results.

I tested an LRU cache at scale and found that it provided a noticeable reduction in assignment
time compared with removing bans at the end of scheduling rounds, for the reasons listed in
Bill's comment (services being rescheduled frequently, high cron volume).

I posted a review with my implementation: https://reviews.apache.org/r/63199/

However, it requires adding another option, which this patch does not.
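
For readers unfamiliar with the approach, here is a minimal sketch of the LRU idea. It is illustrative only and is not the code in the review linked above: the class name `StaticBanCache`, its methods, and the `maxEntries` bound are all hypothetical, and it assumes a static ban can be identified by an (offer ID, task group key) pair of strings.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical LRU-bounded store of (offer, task group) static bans. */
final class StaticBanCache {
  private final Map<Map.Entry<String, String>, Boolean> bans;

  StaticBanCache(final int maxEntries) {
    // accessOrder=true keeps entries ordered from least to most recently used.
    this.bans = new LinkedHashMap<Map.Entry<String, String>, Boolean>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<Map.Entry<String, String>, Boolean> eldest) {
        // Evict the least recently used ban once the bound is exceeded,
        // so the structure cannot grow without limit.
        return size() > maxEntries;
      }
    };
  }

  private static Map.Entry<String, String> key(String offerId, String groupKey) {
    return new SimpleImmutableEntry<>(offerId, groupKey);
  }

  /** Records that the offer is statically unsuitable for the task group. */
  synchronized void ban(String offerId, String groupKey) {
    bans.put(key(offerId, groupKey), Boolean.TRUE);
  }

  /** Returns true if a ban is cached; a lookup also refreshes the entry's recency. */
  synchronized boolean isBanned(String offerId, String groupKey) {
    return bans.get(key(offerId, groupKey)) != null;
  }

  synchronized int entryCount() {
    return bans.size();
  }
}
```

In this sketch, bans for task groups that are no longer pending are never removed explicitly. They simply stop being looked up and age out once newer entries fill the cache, which bounds memory while letting bans outlive a scheduling round. An evicted ban that is still relevant would only need to be recomputed.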


- Jordan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63121/#review188843
-----------------------------------------------------------


On Oct. 19, 2017, 12:04 a.m., Bill Farner wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63121/
> -----------------------------------------------------------
> 
> (Updated Oct. 19, 2017, 12:04 a.m.)
> 
> 
> Review request for Aurora and Jordan Ly.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> This alleviates a (slow) memory leak in static offer bans, as entries are only
> removed when an offer is removed.  If a pending task group is depleted
> (either by fully scheduling the group, or terminating the job), the entry
> remains.  This issue is exacerbated when offers are held for a longer duration,
> as is proposed in https://reviews.apache.org/r/62956/.
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/aurora/scheduler/events/PubsubEvent.java 0637eb7f85125cf70b588d56fa7dc88130947837
>   src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java e8334310a2a46a0ccb09ee6e4122c515892d3996
>   src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroups.java 2d3492d05986ef65519fd7a8c71396d055b6881f
>   src/test/java/org/apache/aurora/scheduler/http/AbstractJettyTest.java 6e77857fcf209d3fe70fbd30cfd8484ea0414ee2
>   src/test/java/org/apache/aurora/scheduler/offers/OfferManagerImplTest.java 2cfdc090ff75a63111ae146c9fe7b3542e7ac83f
>   src/test/java/org/apache/aurora/scheduler/scheduling/TaskGroupsTest.java b88d5f13889b81ba4b0171efaf6c759d23976a39
> 
> 
> Diff: https://reviews.apache.org/r/63121/diff/2/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Bill Farner
> 
>

