mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Zhitao Li <zhitaoli...@gmail.com>
Subject Re: Review Request 66051: Implemented operator API to grow and shrink persistent volume.
Date Tue, 10 Apr 2018 04:34:21 GMT


> On March 27, 2018, 6:58 p.m., Greg Mann wrote:
> > src/master/master.cpp
> > Lines 11881-11887 (original), 11885-11890 (patched)
> > <https://reviews.apache.org/r/66051/diff/7/?file=1988982#file1988982line11889>
> >
> >     Currently, it looks like `Slave::usedResources` is the same as the "allocated
resources" from that agent.
> >     
> >     When a GROW or SHRINK operation is initiated by an operator, its consumed resource
is in a unique state in the master while that operation is pending: the volume is not allocated,
but it is "used" in some sense.
> >     
> >     One place where we use `Slave::usedResources` is when validating DESTROY operations:
https://github.com/apache/mesos/blob/21e4b45b388d0b272236b1c58313569f8a1d1fc8/src/master/master.cpp#L4841
> >     
> >     This means if the master receives a DESTROY call for a persistent volume while
there is a pending operator-initiated GROW_VOLUME call for that same volume (i.e. the operation
was forwarded to the agent for processing but the master hasn't heard back yet), we would
accept the DESTROY operation and forward it to the agent. Is this what we want?
> >     
> >     This brings something else to mind: would it be possible for the allocator to
send out an offer containing the consumed resource of a GROW_VOLUME or SHRINK_VOLUME operation,
while the operation is pending on the agent? This seems bad, but I believe it's possible.
I think this may be the first time we have a set of resources which we don't want the allocator
to offer for a period of time, but which are not currently allocated to some framework/role.
I wonder if we need some new tools in the allocator to handle this?
> 
> Chun-Hung Hsiao wrote:
>     Yeah this sounds bad. Would it help if we remove the consumed resources from the
allocator and put it back after receiving the terminal status update?
> 
> Zhitao Li wrote:
>     I agree this is problematic.
>     
>     It seems like we need to make sure several things:
>     1. `consumed` resources of a non-speculative operation can not offered to any framework
while pending;
>     2. `consumed` resources should still be visible for various validation;
>     
>     Therefore, I'm thinking about introducing a new field in `Master::Slave` struct which
represents `pendingConsumedResources` (or a function in `Slave` which can extract this info
from pending operations). We can then use it to validation of `DESTROY` call.
>     
>     Another issue I found was in `Master::apply()`, which we directly call `allocator->updateAvailable(slave->id,
{operation})`, but semantically we should be calling `allocator->updateAvailable(slave->id,
consumed, empty)` so that allocator would not see the `converted` resources until operation
confirmations comes back.

I'm separating the actual logic change of supporting non-speculative operator API into future
task and patch.


- Zhitao


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66051/#review200086
-----------------------------------------------------------


On April 9, 2018, 9:32 p.m., Zhitao Li wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66051/
> -----------------------------------------------------------
> 
> (Updated April 9, 2018, 9:32 p.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao, Gaston Kleiman, and Greg Mann.
> 
> 
> Bugs: MESOS-4965
>     https://issues.apache.org/jira/browse/MESOS-4965
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> These operator APIs is implemented as speculative for now, but
> we plan to convert them to non-speculative in the future.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 34c9023906eca94965acc994f20e888c1f47b962 
>   src/master/master.hpp 0d9620dd0c232dc1df83477e838eeb7313bf8828 
>   src/master/validation.cpp ac2e1bb8771841ec59b3bdcdeffb6c6230680d4d 
> 
> 
> Diff: https://reviews.apache.org/r/66051/diff/10/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Zhitao Li
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message