mesos-issues mailing list archives

From "James Peach (JIRA)" <>
Subject [jira] [Commented] (MESOS-8169) master validation incorrectly rejects slaves, buggy executorID checking
Date Wed, 08 Nov 2017 16:25:01 GMT


James Peach commented on MESOS-8169:

commit 677305f0b8a161d87d16ff156f58d2b0789c15e0 (HEAD -> master, origin/master, origin/HEAD)
Author: James Peach <>
Date:   Tue Nov 7 16:59:27 2017 -0800

    Added a test for ExecutorID validation in ReregisterSlaveMessage.

    Added a test to ensure that the ReregisterSlaveMessage validation
    correctly allows duplicate ExecutorIDs as long as they are scoped
    to different frameworks.


> master validation incorrectly rejects slaves, buggy executorID checking
> -----------------------------------------------------------------------
>                 Key: MESOS-8169
>                 URL:
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 1.4.0
>            Reporter: James DeFelice
>            Assignee: James DeFelice
>              Labels: mesosphere
>             Fix For: 1.5.0
> proposed fix:
> I observed this in my environment, where I had two frameworks that used the same ExecutorID
> and then triggered a master failover. The master refuses to reregister the slave because it
> does not consider the owning framework of each ExecutorID when computing ExecutorID uniqueness,
> and so concludes (incorrectly) that there is an erroneous duplicate ExecutorID:
> {code}
> W1103 00:33:42.509891 19638 master.cpp:6008] Dropping re-registration of agent at slave(1)@ because it sent an invalid re-registration: Executor has a duplicate ExecutorID 'default'
> {code}
> (yes, "default" is probably a terrible name for an ExecutorID - that's a separate discussion!)
> /cc [~neilc]
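
The uniqueness problem described in the issue can be illustrated with a minimal sketch. This is not the Mesos master code (which is C++). The ExecutorInfo tuple, the function names validate_unscoped and validate_scoped, and the framework IDs "fw-1" and "fw-2" are all hypothetical, and the second error message is an assumed wording. The sketch only contrasts a check keyed on the ExecutorID alone with one keyed on the (FrameworkID, ExecutorID) pair.

```python
from typing import Iterable, NamedTuple, Optional


class ExecutorInfo(NamedTuple):
    """Hypothetical stand-in for the two ExecutorInfo fields that matter here."""
    framework_id: str
    executor_id: str


def validate_unscoped(executors: Iterable[ExecutorInfo]) -> Optional[str]:
    """Buggy behaviour as described in the issue: the owning framework is ignored."""
    seen = set()
    for executor in executors:
        if executor.executor_id in seen:
            return f"Executor has a duplicate ExecutorID '{executor.executor_id}'"
        seen.add(executor.executor_id)
    return None  # no error found


def validate_scoped(executors: Iterable[ExecutorInfo]) -> Optional[str]:
    """Uniqueness scoped per framework: only the same pair counts as a duplicate."""
    seen = set()
    for executor in executors:
        key = (executor.framework_id, executor.executor_id)
        if key in seen:
            # Assumed wording, not taken from Mesos.
            return (
                f"Executor has a duplicate ExecutorID '{executor.executor_id}' "
                f"in framework '{executor.framework_id}'"
            )
        seen.add(key)
    return None  # no error found


# Two frameworks that both use the ExecutorID 'default', as in the report.
executors = [
    ExecutorInfo(framework_id="fw-1", executor_id="default"),
    ExecutorInfo(framework_id="fw-2", executor_id="default"),
]
```

With this input, validate_unscoped reports a duplicate while validate_scoped accepts the list. The scoped check still rejects a list in which the same framework lists the same ExecutorID twice.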

