mesos-dev mailing list archives

From Gabriel Hartmann <gabr...@mesosphere.io>
Subject Re: Offer operation reconciliation discussion notes
Date Wed, 23 Aug 2017 21:55:33 GMT
Please can the "reason" be the reason for the failure and NOT the reason
the message was sent (e.g., "RECONCILIATION")?

On Wed, Aug 23, 2017 at 1:58 PM Yan Xu <xujyan@apple.com> wrote:

> Yeah, a reason for failed operations is probably useful for all resource
> operations. It looks like the task-style status update is still the best
> approach.
>
> ---
> @xujyan <https://twitter.com/xujyan>
>
> On Wed, Aug 23, 2017 at 11:40 AM, Jie Yu <yujie.jay@gmail.com> wrote:
>
>> We should continue the discussion here:
>>
>> I think I forgot to mention one important reason that I went for the
>> operation-based reconciliation API proposal. For new operations like
>> CREATE_VOLUME/CREATE_BLOCK, we need to know not only the end result (the
>> resources) if it's successful, but also the failure reason if it fails.
>> For instance, imagine you're creating an EBS volume by talking to a CSI
>> EBS plugin. Surfacing the creation error (e.g., whether the CSI plugin
>> reports it as retryable or not) will be useful for the scheduler to
>> determine the next step.
>>
>> I don't think a resource-based reconciliation API can address this.
>> Maybe we can add both if we feel both are useful?
>>
>> Thoughts?
>> - Jie
>>
>> On Wed, Aug 23, 2017 at 11:26 AM, Jie Yu <yujie.jay@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> We had a discussion on a very early proposal (see the attached
>>> slides) on providing feedback for offer operations (e.g., CREATE/DESTROY,
>>> RESERVE/UNRESERVE, etc.) with a bunch of folks from the community. Here are
>>> the notes I captured in the meeting:
>>>
>>>
>>>    - One alternative approach discussed was to have best-effort
>>>    feedback, and a resource-based reconciliation API allowing a framework
>>>    to query the resources on a given resource provider or agent. That way,
>>>    we don't necessarily need the status update mechanism for offer
>>>    operations, which adds complexity to the frameworks.
>>>    - In the current proposal, do we need agent_id (or resource provider
>>>    id) when performing reconciliation for that operation? The reason we
>>>    require that in the task reconciliation case is that the agent might
>>>    not have re-registered yet during a master failover.
>>>    - We need to mock up the operator API for this work.
>>>    - What's the ordering guarantee for the operations specified in one
>>>    API call?
>>>    - Wish list
>>>       - Tie reservations to a framework instead of a role.
>>>       - When a framework is torn down, automatically release the
>>>       resources reserved for that framework.
>>>
>>> If I missed anything, please reply to this thread! Thanks!
>>>
>>>
>>> https://docs.google.com/presentation/d/1Mef8K3aLIuzcFVc3MnAo64TkjpyTWarYVShtvCN4e48/edit?usp=sharing
>>>
>>> - Jie
>>>
>>
>>
>
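
The two requests in this thread (surface a failure reason that tells the scheduler whether to retry, and keep that reason separate from why the update was sent) can be illustrated with a minimal sketch. This is not the Mesos API: every name below (OperationStatus, State, Source, next_step, and the field names) is hypothetical and exists only to show the shape of the idea.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class State(Enum):
    # Hypothetical operation states, for illustration only.
    PENDING = "PENDING"
    FINISHED = "FINISHED"
    FAILED = "FAILED"


class Source(Enum):
    # Why the update was sent. Kept apart from the failure reason,
    # as requested at the top of the thread.
    UPDATE = "UPDATE"
    RECONCILIATION = "RECONCILIATION"


@dataclass(frozen=True)
class OperationStatus:
    # Hypothetical status message for an offer operation.
    operation_id: str
    state: State
    source: Source
    failure_reason: Optional[str] = None  # set only when state is FAILED
    retryable: bool = False               # assumed to come from the plugin


def next_step(status: OperationStatus) -> str:
    """Decide what a scheduler might do with a status update."""
    if status.state is State.FINISHED:
        return "use-resources"
    if status.state is State.PENDING:
        return "wait"
    # FAILED: the decision depends on the failure itself,
    # not on whether the update arrived through reconciliation.
    return "retry" if status.retryable else "give-up"
```

With this shape, a failed CREATE_VOLUME reported during reconciliation still carries the original failure reason, so the scheduler reaches the same decision regardless of how the update arrived.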
