mesos-issues mailing list archives

From "haosdent (JIRA)" <>
Subject [jira] [Commented] (MESOS-3280) Master fails to access replicated log after network partition
Date Tue, 18 Aug 2015 17:44:46 GMT


haosdent commented on MESOS-3280:

The details of how aphyr tested this are in the article [Call me Maybe: Chronos](
It is on the Hacker News front page now. Another user's question on Stack Overflow also seems to be caused
by this bug: [mesos-master crash with zookeeper cluster](

> Master fails to access replicated log after network partition
> -------------------------------------------------------------
>                 Key: MESOS-3280
>                 URL:
>             Project: Mesos
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.23.0
>         Environment: Zookeeper version 3.4.5--1
>            Reporter: Bernd Mathiske
>              Labels: mesosphere
> In a 5 node cluster with 3 masters and 2 slaves, and ZK on each node, when a network
partition is forced, all the masters apparently lose access to their replicated log. The leading
master halts. Unknown reasons, but presumably related to replicated log access. The others
fail to recover from the replicated log. Unknown reasons. This could have to do with ZK setup,
but it might also be a Mesos bug. 
> This was observed in a Chronos test drive scenario described in detail here:
> With setup instructions here:

