mesos-issues mailing list archives

From "Benjamin Bannier (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MESOS-8350) Resource provider-capable agents not correctly synchronizing checkpointed agent resources on reregistration
Date Thu, 04 Jan 2018 12:29:00 GMT

     [ https://issues.apache.org/jira/browse/MESOS-8350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Bannier updated MESOS-8350:
------------------------------------
    Fix Version/s: 1.5.0

> Resource provider-capable agents not correctly synchronizing checkpointed agent resources on reregistration
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-8350
>                 URL: https://issues.apache.org/jira/browse/MESOS-8350
>             Project: Mesos
>          Issue Type: Bug
>          Components: master
>            Reporter: Benjamin Bannier
>            Assignee: Benjamin Bannier
>            Priority: Critical
>             Fix For: 1.5.0, 1.6.0
>
>
> For resource provider-capable agents, the master does not re-send checkpointed resources on agent reregistration; instead, the checkpointed resources sent as part of the {{ReregisterSlaveMessage}} should be used.
> This is not what happens in practice. If, for example, checkpointing of an offer operation fails and the agent fails over, the checkpointed resources are, as expected, not reflected in the agent, but are still assumed by the master.
> A workaround is to fail over the master, which causes the newly elected master to bootstrap agent state from the {{ReregisterSlaveMessage}}.
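
The mismatch described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: the classes Agent and Master, their methods, and the resource name "disk_reservation" are hypothetical and are not taken from the Mesos code base; the model only mirrors the behaviour described in the report, not how the fix was implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical stand-in for a resource provider-capable agent."""
    agent_id: str
    # What the agent actually managed to checkpoint to disk.
    checkpointed_resources: dict = field(default_factory=dict)


class Master:
    """Hypothetical master tracking its view of agents' checkpointed resources."""

    def __init__(self):
        self.checkpointed = {}  # agent_id -> resources the master assumes

    def apply_operation(self, agent_id, name, value):
        # The master records the operation's result in its own view.
        self.checkpointed.setdefault(agent_id, {})[name] = value

    def reregister_as_reported(self, agent):
        # Behaviour described in the report: the master keeps its existing
        # view and ignores what the agent sent on reregistration.
        self.checkpointed.setdefault(agent.agent_id, {})

    def reregister_as_expected(self, agent):
        # Expected behaviour: adopt the checkpointed resources carried in
        # the agent's reregistration message.
        self.checkpointed[agent.agent_id] = dict(agent.checkpointed_resources)


master = Master()
agent = Agent("agent-1")

# The master applies an operation, but checkpointing fails on the agent,
# so the agent's checkpointed resources stay empty across its failover.
master.apply_operation("agent-1", "disk_reservation", 64)

master.reregister_as_reported(agent)
stale_view = dict(master.checkpointed["agent-1"])

master.reregister_as_expected(agent)
synced_view = dict(master.checkpointed["agent-1"])
```

In this model, stale_view still contains the reservation the agent never checkpointed, while synced_view matches the agent. Constructing a fresh Master and calling reregister_as_expected corresponds loosely to the master-failover workaround, where the new master has no prior view to hold on to.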



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
