aurora-reviews mailing list archives

From Aurora ReviewBot <wfar...@apache.org>
Subject Re: Review Request 59480: Expose bin-packing options via OfferManager ordering.
Date Tue, 23 May 2017 08:24:33 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59480/#review175771
-----------------------------------------------------------


Ship it!




Master (4c0974b) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot retry"

- Aurora ReviewBot


On May 23, 2017, 7:41 a.m., David McLaughlin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59480/
> -----------------------------------------------------------
> 
> (Updated May 23, 2017, 7:41 a.m.)
> 
> 
> Review request for Aurora, Santhosh Kumar Shanmugham and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> This patch enables scalable, high-performance Scheduler bin-packing using the existing first-fit task assigner, and it can be controlled with a simple command line argument.
> 
> The bin-packing is only an approximation, but it can lead to significant improvements in resource utilization per agent. For example, on a CPU-bound cluster with 30k+ hosts and 135k tasks (across 1k+ jobs), we were able to reduce the share of hosts with tasks scheduled on them to just 90%, down from 99.7% (as one would expect from randomization). So if you are running Aurora on elastic computing and paying for machines by the minute/hour, then utilizing this patch _could_ allow you to reduce your server footprint by as much as 10%.
> 
> The approximation is based on the simple idea that you have the best chance of perfect bin-packing if you put tasks in the smallest slot available. So if you have a task needing 8 cores and you have an 8 core and a 12 core offer available, you'd always want to put the task in the 8 core offer*. When OfferManager iterates over offers in sorted order, a first-fit algorithm is guaranteed to match the smallest possible offer for your task, which achieves this.
> 
> * - The correct decision of course depends on the other pending tasks and the other resources available, and more satisfactory results may also need preemption, etc.
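
The sorted first-fit idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the patch: the `Offer` class, the `firstFit` method, and the host names are hypothetical, and only CPU is modeled, whereas real offers carry several resource types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class SortedFirstFit {
  /** Hypothetical stand-in for a resource offer; only CPU is modeled. */
  static final class Offer {
    final String host;
    final double cpus;

    Offer(String host, double cpus) {
      this.host = host;
      this.cpus = cpus;
    }
  }

  /**
   * Sorts a copy of the offers by ascending CPU and returns the first one
   * large enough for the task. Because of the ordering, the first fit is
   * also the smallest offer that fits.
   */
  static Optional<Offer> firstFit(List<Offer> offers, double taskCpus) {
    List<Offer> sorted = new ArrayList<>(offers);
    sorted.sort(Comparator.comparingDouble((Offer o) -> o.cpus));
    for (Offer offer : sorted) {
      if (offer.cpus >= taskCpus) {
        return Optional.of(offer);
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    List<Offer> offers = Arrays.asList(
        new Offer("host-a", 12), new Offer("host-b", 8), new Offer("host-c", 4));
    // An 8-core task lands on the 8-core offer rather than the 12-core one.
    System.out.println(firstFit(offers, 8).map(o -> o.host).orElse("none"));
  }
}
```

Without the sort, plain first-fit over this list would pick the 12-core offer simply because it comes first, leaving 4 cores stranded on that host.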
> 
> 
> Diffs
> -----
> 
>   RELEASE-NOTES.md 77376e438bd7af74c364dcd5d1b3e3f1ece2adbf
>   src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java f2296a9d7a88be7e43124370edecfe64415df00f
>   src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java 78255e6dfa31c4920afc0221ee60ec4f8c2a12c4
>   src/main/java/org/apache/aurora/scheduler/offers/OfferOrder.java PRE-CREATION
>   src/main/java/org/apache/aurora/scheduler/offers/OfferSettings.java adf7f33e4a72d87c3624f84dfe4998e20dc75fdc
>   src/main/java/org/apache/aurora/scheduler/offers/OffersModule.java 317a2d26d8bfa27988c60a7706b9fb3aa9b4e2a2
>   src/test/java/org/apache/aurora/scheduler/offers/OfferManagerImplTest.java d7addc0effb60c196cf339081ad81de541d05385
>   src/test/java/org/apache/aurora/scheduler/resources/ResourceTestUtil.java 676d305d257585e53f0a05b359ba7eb11f1b23be
> 
> 
> Diff: https://reviews.apache.org/r/59480/diff/1/
> 
> 
> Testing
> -------
> 
> This has been scale-tested with production-like workloads and performs well, adding only a few extra seconds total in TaskAssigner when applied to thousands of tasks per minute.
> 
> There is an overhead when scheduling tasks that have large resource requirements, as the task assigner will first need to skip over all the offers with low resources. In a packed cluster, this is where the extra seconds are spent. This could be reduced by just jumping over all the offers we know to be too small, but that decision has to map to the OfferOrder (which adds complexity). That can be addressed in a follow-up review if needed.
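
One way the "jumping over" idea might look, if offers are ordered by a single resource, is a binary search for the first offer that is large enough. This is a sketch under that assumption only; the class and method names are hypothetical, and an ordering over several resources would not reduce to a single search like this, which is the complexity noted above.

```java
import java.util.Arrays;
import java.util.List;

public class SkipSmallOffers {
  /**
   * Returns the index of the first offer with at least taskCpus CPUs, or
   * sortedCpus.size() if no offer is large enough. The input list must be
   * sorted in ascending order.
   */
  static int firstCandidate(List<Double> sortedCpus, double taskCpus) {
    int lo = 0;
    int hi = sortedCpus.size();
    while (lo < hi) {
      int mid = (lo + hi) >>> 1;
      if (sortedCpus.get(mid) < taskCpus) {
        lo = mid + 1; // everything up to mid is too small
      } else {
        hi = mid; // mid fits, but an earlier offer might too
      }
    }
    return lo;
  }

  public static void main(String[] args) {
    List<Double> sortedCpus = Arrays.asList(1.0, 2.0, 4.0, 8.0, 12.0);
    // For an 8-core task, iteration can start at index 3 and skip the rest.
    System.out.println(firstCandidate(sortedCpus, 8.0));
  }
}
```

The search takes logarithmic rather than linear time in the number of offers, which matters most in a packed cluster where nearly all remaining offers are small.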
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>

