mesos-dev mailing list archives

From Yan Xu <>
Subject Re: Offer operation reconciliation discussion notes
Date Wed, 23 Aug 2017 20:58:06 GMT
Yeah, a reason for failed operations is probably useful for all resource
operations. It looks like the task-style status update is still the best

@xujyan <>

On Wed, Aug 23, 2017 at 11:40 AM, Jie Yu <> wrote:

> We should continue the discussion here:
> I think I forgot to mention one important reason that I went for the
> operation-based reconciliation API proposal. For new operations like
> CREATE_VOLUME/CREATE_BLOCK, we need to know not only the end result (the
> resources) if it succeeds, but also the failure reason if it fails. For
> instance, imagine you're creating an EBS volume by talking to a CSI EBS
> plugin. Surfacing the creation error (e.g., whether the CSI plugin reports
> it as retryable or not) will be useful for the scheduler to determine the
> next step.
> I don't think a resource-based reconciliation API can address this. Maybe
> we can add both if we feel both are useful?
> Thoughts?
> - Jie
> On Wed, Aug 23, 2017 at 11:26 AM, Jie Yu <> wrote:
>> Hi,
>> We had a discussion on a very early proposal (see the attached slides)
>> on providing feedback for offer operations (e.g., CREATE/DESTROY,
>> RESERVE/UNRESERVE, etc.) with a bunch of folks from the community. Here are
>> the notes I captured in the meeting:
>>    - One alternative approach discussed was to have best-effort
>>    feedback, plus a resource-based reconciliation API allowing a framework
>>    to query the resources on a given resource provider or agent. That way,
>>    we don't necessarily need the status update mechanism for offer
>>    operations, which adds complexity to the frameworks.
>>    - In the current proposal, do we need agent_id (or resource provider
>>    id) when performing reconciliation for that operation? The reason we
>>    require that in the task reconciliation case is that the agent might
>>    not have re-registered yet after a master failover.
>>    - We need to mock up the operator API for this work.
>>    - What's the ordering guarantee for the operations specified in one
>>    API call?
>>    - Wish list
>>       - Reservations tied to a framework instead of a role.
>>       - When a framework is torn down, automatically release the
>>       resources reserved for that framework.
>> If I missed anything, please reply to this thread! Thanks!
>> 64TkjpyTWarYVShtvCN4e48/edit?usp=sharing
>> - Jie
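
The operation-based feedback argued for in this thread could be sketched roughly as follows. This is only an illustration under stated assumptions: `OperationStatus`, `OperationState`, `retryable`, and `next_step` are hypothetical names made up for this sketch and are not part of any Mesos API. It shows how a status carrying both the resulting resources and a failure reason would let a scheduler decide what to do next.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class OperationState(Enum):
    # Hypothetical states for an offer operation such as CREATE_VOLUME.
    PENDING = "PENDING"
    FINISHED = "FINISHED"
    FAILED = "FAILED"


@dataclass
class OperationStatus:
    # Hypothetical shape of a per-operation status update (not the Mesos API).
    operation_id: str
    state: OperationState
    # End result of a successful operation, e.g. the created volume.
    converted_resources: Optional[List[str]] = None
    # Failure detail surfaced from the plugin when the operation fails.
    failure_reason: Optional[str] = None
    # Assumed flag telling the scheduler whether a retry could succeed.
    retryable: bool = False


def next_step(status: OperationStatus) -> str:
    """Pick the scheduler's next action from an operation status."""
    if status.state is OperationState.FINISHED:
        return "use_resources"
    if status.state is OperationState.FAILED:
        return "retry" if status.retryable else "give_up"
    # Still pending: keep waiting, or reconcile later.
    return "wait"
```

A resource-only query could report which resources exist, but it would have no place for `failure_reason` or `retryable`, which is the gap described above.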
