Date: Thu, 4 Jan 2018 12:29:00 +0000 (UTC)
From: "Benjamin Bannier (JIRA)"
To: issues@mesos.apache.org
Reply-To: dev@mesos.apache.org
Subject: [jira] [Updated] (MESOS-8350) Resource provider-capable agents not correctly synchronizing checkpointed agent resources on reregistration

[ https://issues.apache.org/jira/browse/MESOS-8350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Bannier updated MESOS-8350:
------------------------------------
    Fix Version/s: 1.5.0

> Resource provider-capable agents not correctly synchronizing checkpointed agent resources on reregistration
> -----------------------------------------------------------------------------------------------------------
>
>          Key: MESOS-8350
>          URL: https://issues.apache.org/jira/browse/MESOS-8350
>      Project: Mesos
>   Issue Type: Bug
>   Components: master
>     Reporter: Benjamin Bannier
>     Assignee: Benjamin Bannier
>     Priority: Critical
>      Fix For: 1.5.0, 1.6.0
>
>
> For resource provider-capable agents the master does not re-send checkpointed resources on agent reregistration; instead the checkpointed resources sent as part of the {{ReregisterSlaveMessage}} should be used.
> This is not what happens in reality.
If, e.g., checkpointing of an offer operation fails and the agent fails over, the checkpointed resources would, as expected, not be reflected in the agent, but would still be assumed in the master.
> A workaround is to fail over the master, which leads the newly elected master to bootstrap agent state from {{ReregisterSlaveMessage}}.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
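
The intended reregistration behavior described in the issue can be illustrated with a small toy model. This is only a sketch and not Mesos code: `ToyMaster`, `AgentRecord`, and `reregister` are hypothetical names, resources are modeled as plain strings, and the `reported` argument stands in for the checkpointed resources carried in {{ReregisterSlaveMessage}}. The legacy branch (master re-sends its own view) is an assumption inferred from the issue text.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class AgentRecord:
    """Master-side view of one agent (hypothetical, not a Mesos type)."""
    checkpointed_resources: Set[str] = field(default_factory=set)


@dataclass
class ToyMaster:
    """Toy stand-in for the master's per-agent bookkeeping."""
    agents: Dict[str, AgentRecord] = field(default_factory=dict)

    def reregister(
        self,
        agent_id: str,
        reported: Set[str],
        resource_provider_capable: bool,
    ) -> Optional[Set[str]]:
        """Handle a toy reregistration.

        Returns the resources the master would re-send to the agent,
        or None if nothing is re-sent.
        """
        record = self.agents.setdefault(agent_id, AgentRecord())
        if resource_provider_capable:
            # Intended behavior per the issue: adopt what the agent
            # reports and do not re-send anything.
            record.checkpointed_resources = set(reported)
            return None
        # Assumed legacy behavior: the master keeps its own view and
        # re-sends it to the agent.
        return set(record.checkpointed_resources)


# Scenario from the issue: the master assumed an operation was applied,
# but the agent failed to checkpoint it and then failed over.
master = ToyMaster()
master.agents["agent-1"] = AgentRecord({"disk:64"})
resent = master.reregister(
    "agent-1", reported=set(), resource_provider_capable=True
)
```

With the intended behavior, the master's stale view of `agent-1` is replaced by the (empty) set the agent reports. The bug described above corresponds to the master skipping that assignment while still not re-sending anything.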