mesos-reviews mailing list archives

From Jacob Janco <jjanco....@gmail.com>
Subject Re: Review Request 51027: WIP: Track allocation candidates to bound allocator.
Date Wed, 21 Sep 2016 17:46:44 GMT


> On Sept. 12, 2016, 8:46 p.m., Benjamin Mahler wrote:
> > src/master/allocator/mesos/hierarchical.cpp, lines 273-274
> > <https://reviews.apache.org/r/51027/diff/4/?file=1490734#file1490734line273>
> >
> >     It seems a bit odd that the caller has to both touch allocation candidates and then call ensureAllocation.
> >     
> >     A simpler way to think about this may be that `allocate` has changed from a synchronous function to an asynchronous one. I.e. the call-site here would be:
> >     
> >     ```
> >     void allocate(...); // becomes:
> >     Future<Nothing> allocate(...);
> >     
> >     // Call-site:
> >     Future<Nothing> allocated = allocate(slaves.keys());
> >     
> >     // Actual allocation logic is a continuation now:
> >     Nothing _allocate(...);
> >     ```
> >     
> >     (We probably do not need the Future just yet since callers don't need to set up continuations for now, however we might as well add it to express the asynchronous nature of the function.)
> >     
> >     This also avoids the need for the callers to touch the data structure correctly, which seems error-prone. They just call `allocate` and it deals with adding the SlaveIDs that need allocations performed.
> 
> Guangya Liu wrote:
>     The caller touches the allocation candidates and then calls `ensureAllocation` because it needs to handle different allocation events:
>     
>     1) For some allocation events, such as `addFramework`, we need to use `allocationCandidates = slaves.keys();` because we want to take all free resources from the resource pool and allocate them to the new framework.
>     2) For other allocation events, such as `addSlave`, there is no need to use `slaves.keys()`; adding only the newly added agent to `allocationCandidates` is enough. This shortens the allocation cycle: allocating a single agent is faster than allocating all agents when there are many agents.
>     3) The `allocationCandidates` need to be cleared after each allocation cycle.

I moved this to a WIP pending reviewers' input, taking the patch in the direction Ben suggested after some discussion with Yan.
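
To make the discussion above concrete, here is a minimal sketch of the asynchronous shape Ben describes. It is not the code in the patch: `SketchAllocator`, `settle`, `runs`, and `lastAllocated` are hypothetical names, a plain `std::deque` stands in for the libprocess event queue, and the `Future<Nothing>` return value is left out. Callers only pass the agents that need allocation; the candidates accumulate, and a run is enqueued only when none is already pending.

```cpp
#include <deque>
#include <functional>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical illustration only; a std::deque stands in for the
// actor's event queue that libprocess would provide.
class SketchAllocator {
public:
  // Callers just name the agents that need allocation. Candidates
  // accumulate, and a run is enqueued only if none is pending.
  void allocate(const std::vector<std::string>& agentIds) {
    candidates.insert(agentIds.begin(), agentIds.end());
    if (!pending) {
      pending = true;
      queue.push_back([this]() { _allocate(); });
    }
  }

  // Drains the queue, as the actor's event loop would.
  void settle() {
    while (!queue.empty()) {
      std::function<void()> event = std::move(queue.front());
      queue.pop_front();
      event();
    }
  }

  int runs = 0;                         // Allocation runs performed so far.
  std::set<std::string> lastAllocated;  // Agents covered by the latest run.

private:
  // The continuation: handles every accumulated candidate once, then
  // clears the set so the next cycle starts empty.
  void _allocate() {
    lastAllocated = candidates;
    candidates.clear();
    pending = false;
    ++runs;
  }

  std::set<std::string> candidates;
  bool pending = false;
  std::deque<std::function<void()>> queue;
};
```

In this sketch, two back-to-back `allocate` calls followed by `settle()` result in a single run covering the union of both candidate sets, which is the bounding behavior the description refers to.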


- Jacob


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51027/#review148579
-----------------------------------------------------------


On Sept. 21, 2016, 5:42 p.m., Jacob Janco wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51027/
> -----------------------------------------------------------
> 
> (Updated Sept. 21, 2016, 5:42 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Guangya Liu, James Peach, Klaus Ma, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-3157
>     https://issues.apache.org/jira/browse/MESOS-3157
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> - Triggered allocations dispatch allocate() only
>   if there is no pending allocation in the queue.
> - Allocation candidates are accumulated and only
>   cleared when enqueued allocations are processed.
> - Batched allocations are handled synchronously.
> 
> 
> Diffs
> -----
> 
>   src/master/allocator/mesos/hierarchical.hpp 2c31471ee0f5d6836393bf87ff9ecfd8df835013
>   src/master/allocator/mesos/hierarchical.cpp 2d56bd011f2c87c67a02d0ae467a4a537d36867e
> 
> Diff: https://reviews.apache.org/r/51027/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> note: check without filters depends on https://reviews.apache.org/r/51028
> 
> With new benchmark https://reviews.apache.org/r/49617: 
> Sample output without 51027:
> [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
> Using 10000 agents and 3000 frameworks
> Added 3000 frameworks in 57251us
> Added 10000 agents in 3.21345353333333mins
> allocator settled after  1.61236038333333mins
> [       OK ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22 (290578 ms)
> 
> Sample output with 51027:
> [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
> Using 10000 agents and 3000 frameworks
> Added 3000 frameworks in 39817us
> Added 10000 agents in 3.22860541666667mins
> allocator settled after  25.525654secs
> [       OK ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22 (220137 ms)
> 
> 
> Thanks,
> 
> Jacob Janco
> 
>

