mesos-reviews mailing list archives

From "Alexander Rukletsov" <ruklet...@gmail.com>
Subject Re: Review Request 35702: Added /reserve HTTP endpoint to the master.
Date Tue, 28 Jul 2015 12:56:36 GMT


> On July 16, 2015, 2:54 p.m., Alexander Rukletsov wrote:
> > src/master/http.cpp, line 447
> > <https://reviews.apache.org/r/35702/diff/9/?file=994080#file994080line447>
> >
> >     Not directly related to endpoints, but to dynamic reservations in general: do
> >     you think it makes sense to keep track of dynamic reservations, or to have an
> >     aggregating method, in `mesos::internal::master::Role`?
> 
> Michael Park wrote:
>     We have a `Role::resources()` function which aggregates all resources, and we can
>     filter for dynamically reserved ones by doing something like:
>     `resources.filter(Resources::isDynamicallyReserved)`.
>     Is this sufficient for what you're asking about, or is there more?

Good point! I think this is close to what I had in mind. However, one thing still bothers
me: how can we let somebody who is not very familiar with the codebase know that they can
use tricks like this? Maybe a comment in the `Role` struct like

```
NOTE: You can use filters to extract specific resources, e.g. Role::resources().filter(Resources::isDynamicallyReserved).
```
But maybe it's too much (why put such a comment into the `Role` struct?). What do you think?
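
To make the pattern concrete, here is a minimal, self-contained sketch of filtering by a predicate. The `Resource` and `Resources` types below are hypothetical, simplified stand-ins written for illustration, not the real Mesos classes, and the fields and the reservation check are assumptions:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a single resource (not the Mesos protobuf).
struct Resource {
  std::string name;
  double amount;
  std::string role;     // "*" stands for "unreserved" in this sketch.
  bool hasReservation;  // Assumption: set when the reservation was made dynamically.
};

// Hypothetical, simplified stand-in for a resource collection.
class Resources {
public:
  void add(const Resource& resource) {
    assert(resource.amount >= 0.0);  // Negative amounts make no sense here.
    resources_.push_back(resource);
  }

  // Predicate: true if the resource is reserved for a role and the
  // reservation was made dynamically.
  static bool isDynamicallyReserved(const Resource& resource) {
    return resource.role != "*" && resource.hasReservation;
  }

  // Returns a new collection holding only the resources that satisfy the predicate.
  Resources filter(const std::function<bool(const Resource&)>& predicate) const {
    Resources result;
    for (const Resource& resource : resources_) {
      if (predicate(resource)) {
        result.add(resource);
      }
    }
    return result;
  }

  std::size_t size() const { return resources_.size(); }

private:
  std::vector<Resource> resources_;
};
```

With types shaped like this, `all.filter(Resources::isDynamicallyReserved)` yields just the dynamically reserved subset, which is the usage the proposed NOTE would point to.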


- Alexander


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review91889
-----------------------------------------------------------


On July 27, 2015, 11:30 p.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated July 27, 2015, 11:30 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere,
and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated. I've captured the various approaches,
> along with their limitations and deal-breakers, here: [Master Endpoint Implementation
> Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from
the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` must never leave the allocator in an over-allocated state,
>       whereas `updateSlave` can and does.
> * The algorithm:
>     * Initially, the master pessimistically assumes that what seems like "available" resources
>       will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to
>       itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many
offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(...,
None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy
the request.
>     * If we still don't have enough resources after rescinding all offers, be optimistic
>       and forward the request to the allocator, since there may be available resources to
>       satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`.
>       This could be changed to `Forbidden` or `Conflict` instead. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as few offers
> as possible.
> The challenges of implementing the ideal solution in the current state are described in
> the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 3a1598fad4db03e5f62fd4a6bd26b2bedeee4070 
>   src/master/master.hpp 827d0d599912b2936beb9615610f627f6c9a2d43 
>   src/master/master.cpp 5b5e3c37d4433c8524db267866aebc0a35a181f1 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>
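
For readers following the algorithm in the quoted description, the steps can be sketched roughly as below. This is a toy model under loud assumptions, not the code in the diff: resources are a single scalar, the `Allocator` struct, the `Status` enum, and the `reserve` function are hypothetical names, and the sketch assumes the request is always forwarded to the allocator once rescinding stops.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar model of the allocator: one pool of available resources.
struct Allocator {
  double available = 0.0;

  // Returns resources from a rescinded offer to the available pool.
  void recoverResources(double amount) { available += amount; }

  // Allowed to fail: refuses the request rather than over-allocating.
  bool updateAvailable(double requested) {
    if (requested > available) {
      return false;
    }
    available -= requested;
    return true;
  }
};

// Hypothetical outcome type; the description maps failure to `PreconditionFailed`.
enum class Status { OK, PreconditionFailed };

Status reserve(double requested, std::vector<double>& offers, Allocator& allocator) {
  assert(requested >= 0.0);

  // Pessimistic phase: count only what is taken back from outstanding offers,
  // rescinding one offer at a time until enough has been recovered.
  double recovered = 0.0;
  while (recovered < requested && !offers.empty()) {
    const double offer = offers.back();
    offers.pop_back();  // Rescind the offer.
    allocator.recoverResources(offer);
    recovered += offer;
  }

  // Optimistic phase: even if rescinding was not enough, the allocator may
  // still hold free resources, so forward the request and let it decide.
  return allocator.updateAvailable(requested) ? Status::OK
                                              : Status::PreconditionFailed;
}
```

The sketch leaves out the race with `allocate` entirely; it only shows the ordering of the pessimistic rescind loop and the optimistic fallback.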

