mesos-issues mailing list archives

From "Dominic Hamon (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MESOS-1517) Maintain a queue of messages that arrive before the master recovers.
Date Mon, 10 Nov 2014 19:34:34 GMT

     [ https://issues.apache.org/jira/browse/MESOS-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dominic Hamon updated MESOS-1517:
---------------------------------
    Labels: reliability twitter  (was: reliability)

> Maintain a queue of messages that arrive before the master recovers.
> --------------------------------------------------------------------
>
>                 Key: MESOS-1517
>                 URL: https://issues.apache.org/jira/browse/MESOS-1517
>             Project: Mesos
>          Issue Type: Improvement
>          Components: master
>            Reporter: Benjamin Mahler
>              Labels: reliability, twitter
>
> Currently, while the master is recovering, we drop all incoming messages. If slaves and
> frameworks learned of the leading master only once it had recovered, then we would only
> expect to see messages after we've recovered.
> We previously considered enqueuing all messages through the recovery future, but this
> has the downside of forcing all messages to go through the master's queue twice:
> {code}
>   // TODO(bmahler): Consider instead re-enqueuing *all* messages
>   // through recover(). What are the performance implications of
>   // the additional queueing delay and the accumulated backlog
>   // of messages post-recovery?
>   if (!recovered.get().isReady()) {
>     VLOG(1) << "Dropping '" << event.message->name << "' message since "
>             << "not recovered yet";
>     ++metrics.dropped_messages;
>     return;
>   }
> {code}
> However, an easy solution to this problem is to maintain an explicit queue of incoming
> messages that gets flushed once we finish recovery. This ensures that all messages that
> arrive post-recovery are processed normally, without passing through an extra queue.
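
The explicit-queue idea described above could look roughly like the following. This is a minimal single-threaded sketch, not Mesos code: `Message`, `RecoveringReceiver`, `receive`, `finishRecovery`, and `pending` are hypothetical names invented for illustration, and the real master dispatches messages through libprocess rather than a plain callback.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for an incoming message (not a Mesos type).
struct Message {
  std::string name;
};

// Hypothetical receiver that holds messages until recovery finishes.
class RecoveringReceiver {
public:
  explicit RecoveringReceiver(std::function<void(const Message&)> handler)
    : handler_(std::move(handler)) {}

  // Called for every incoming message.
  void receive(Message message) {
    if (!recovered_) {
      // Hold the message instead of dropping it.
      pending_.push_back(std::move(message));
      return;
    }
    // Post-recovery: handled directly, with no extra queueing.
    handler_(message);
  }

  // Called once when recovery completes; flushes in arrival order.
  void finishRecovery() {
    while (!pending_.empty()) {
      Message message = std::move(pending_.front());
      pending_.pop_front();
      handler_(message);
    }
    // Only switch to the direct path after the backlog is drained,
    // so messages received during the flush keep their order.
    recovered_ = true;
  }

  std::size_t pending() const { return pending_.size(); }

private:
  std::function<void(const Message&)> handler_;
  std::deque<Message> pending_;
  bool recovered_ = false;
};
```

Only messages that arrive before recovery pay the queueing cost here. The backlog in this sketch is unbounded; an actual implementation would probably need a size limit and a policy for what to do when it is reached.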



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
