incubator-mesos-dev mailing list archives

From "Benjamin Hindman (Resolved) (JIRA)" <>
Subject [jira] [Resolved] (MESOS-109) Master::failoverFramework should remove existing framework offers last
Date Thu, 09 Feb 2012 22:50:57 GMT


Benjamin Hindman resolved MESOS-109.

    Resolution: Fixed
      Assignee: Benjamin Hindman
> Master::failoverFramework should remove existing framework offers last
> ----------------------------------------------------------------------
>                 Key: MESOS-109
>                 URL:
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Benjamin Hindman
>            Assignee: Benjamin Hindman
>            Priority: Critical
> It looks like there is a bug in failing over the framework. As the master goes to remove
> existing offers for the framework, it invokes the allocator's "resourcesRecovered" callback.
> The current implementation of that callback makes new offers for any of those recovered
> resources to existing frameworks. However, in this case, the only existing framework is currently
> being failed over and has a bogus PID. Thus, when the allocator calls back into the master
> to send an offer for the framework, it uses said bogus PID, and those offers get sent into
> The short term fix is to remove the existing offers after all of the failover logic has
> been performed (see Master::failoverFramework). The long term fix is to actually get the allocator
> running independently of the master (as its own libprocess process) so that we don't have
> to think about complicated control flow interactions between the two.
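
The ordering problem described above can be illustrated with a small toy model. This is only a sketch and not Mesos code: `ToyMaster`, `Framework`, `failover_buggy`, and `failover_fixed` are hypothetical names, the allocator callback is reduced to "re-offer recovered resources to every registered framework", and the bogus PID is modeled as the framework's stale pre-failover PID.

```python
from dataclasses import dataclass, field


@dataclass
class Framework:
    framework_id: str
    pid: str  # address that offers are sent to


@dataclass
class ToyMaster:
    # Hypothetical stand-in for the master plus an in-process allocator.
    frameworks: dict = field(default_factory=dict)  # framework_id -> Framework
    offers: dict = field(default_factory=dict)      # framework_id -> [resources]
    sent: list = field(default_factory=list)        # log of (pid, resources)

    def resources_recovered(self, resources):
        # Toy allocator callback: immediately re-offer recovered resources
        # to every registered framework, using whatever PID it has right now.
        for fw in self.frameworks.values():
            self.sent.append((fw.pid, resources))
            self.offers.setdefault(fw.framework_id, []).append(resources)

    def remove_offers(self, framework_id):
        # Removing an offer hands its resources back to the allocator,
        # which synchronously calls back into the master.
        for resources in self.offers.pop(framework_id, []):
            self.resources_recovered(resources)

    def failover_buggy(self, framework_id, new_pid):
        # Offers are removed first, so the callback sees the stale PID.
        self.remove_offers(framework_id)
        self.frameworks[framework_id].pid = new_pid

    def failover_fixed(self, framework_id, new_pid):
        # Short term fix: finish the failover logic, then remove offers last.
        self.frameworks[framework_id].pid = new_pid
        self.remove_offers(framework_id)


def make_master():
    master = ToyMaster()
    master.frameworks["fw"] = Framework("fw", "old-pid")
    master.offers["fw"] = ["cpus:4"]
    return master


buggy = make_master()
buggy.failover_buggy("fw", "new-pid")

fixed = make_master()
fixed.failover_fixed("fw", "new-pid")
```

In this model, `buggy.sent` records the re-offer against `old-pid`, while `fixed.sent` records it against `new-pid`. The only difference between the two methods is the order of the same two steps, which is why the coupling between master and allocator is easy to get wrong.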


