mesos-dev mailing list archives

From Jie Yu <yujie....@gmail.com>
Subject Re: Offer operation reconciliation discussion notes
Date Wed, 30 Aug 2017 18:40:48 GMT
> What are the next steps for moving this forward, Jie?

James, the next step is to create a design doc for this.

Looks like we're aligned on the high-level approach.

On Wed, Aug 30, 2017 at 11:40 AM, Jie Yu <yujie.jay@gmail.com> wrote:

> + Gaston and Greg
>
> Who might be working on this.
>
> On Tue, Aug 29, 2017 at 6:59 AM, James DeFelice <james@mesosphere.io>
> wrote:
>
>> What are the next steps for moving this forward, Jie? I'm very interested
>> in seeing status updates for operations land sooner rather than later.
>>
>> On Wed, Aug 23, 2017 at 5:55 PM, Gabriel Hartmann <gabriel@mesosphere.io>
>> wrote:
>>
>>> Please can the "reason" be the reason for the failure and NOT the reason
>>> the message was sent (e.g. "RECONCILIATION")?
>>>
>>> On Wed, Aug 23, 2017 at 1:58 PM Yan Xu <xujyan@apple.com> wrote:
>>>
>>>> Yeah, a reason for failed operations is probably useful for all resource
>>>> operations. It looks like the task-style status update is still the best
>>>> approach.
>>>>
>>>> ---
>>>> @xujyan <https://twitter.com/xujyan>
>>>>
>>>> On Wed, Aug 23, 2017 at 11:40 AM, Jie Yu <yujie.jay@gmail.com> wrote:
>>>>
>>>>> We should continue the discussion here:
>>>>>
>>>>> I think I forgot to mention one important reason that I went for the
>>>>> operation-based reconciliation API proposal. For new operations like
>>>>> CREATE_VOLUME/CREATE_BLOCK, not only do we need to know the end result (the
>>>>> resources) if it's successful, we also need to know the failure reason if
>>>>> it fails. For instance, imagine you're creating an EBS volume by talking to
>>>>> a CSI EBS plugin. Surfacing the creation error (e.g., retryable or not, from
>>>>> the CSI plugin) will be useful for the scheduler to determine the next step.
>>>>>
>>>>> I don't think a resources-based reconciliation API can address this.
>>>>> Maybe we can add both if we feel both are useful?
>>>>>
>>>>> Thoughts?
>>>>> - Jie
>>>>>
>>>>> On Wed, Aug 23, 2017 at 11:26 AM, Jie Yu <yujie.jay@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> We had a discussion on a very early proposal (see the attached
>>>>>> slides) on providing feedback for offer operations (e.g., CREATE/DESTROY,
>>>>>> RESERVE/UNRESERVE, etc.) with a bunch of folks from the community. Here are
>>>>>> the notes I captured in the meeting:
>>>>>>
>>>>>>
>>>>>>    - One alternative approach discussed was to have best-effort
>>>>>>    feedback, and a resources-based reconciliation API allowing the framework to
>>>>>>    query the resources on a given resource provider or agent. That way, we
>>>>>>    don't necessarily need the status update mechanism for offer operations,
>>>>>>    which causes complexity in the frameworks.
>>>>>>    - In the current proposal, do we need agent_id (or resource
>>>>>>    provider id) when performing reconciliation for that operation? The reason
>>>>>>    we require that in the task reconciliation case is that the agent might not
>>>>>>    have re-registered yet during master failover.
>>>>>>    - We need to mock up the operator API for this work.
>>>>>>    - What's the order guarantee for the operations specified in one
>>>>>>    API call?
>>>>>>    - Wish list
>>>>>>       - Tie reservations to a framework instead of a role.
>>>>>>       - When a framework is torn down, automatically release resources reserved
>>>>>>       for that framework.
>>>>>>
>>>>>> If I missed anything, please reply to this thread! Thanks!
>>>>>>
>>>>>> https://docs.google.com/presentation/d/1Mef8K3aLIuzcFVc3MnAo64TkjpyTWarYVShtvCN4e48/edit?usp=sharing
>>>>>>
>>>>>> - Jie
>>>>>>
>>>>>
>>>>>
>>>>
>>
>
