mesos-issues mailing list archives

From "PAVEL DERENDYAEV (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MESOS-8902) Mesos masters stop to respond after zookeeper cluster recover
Date Thu, 07 Jun 2018 11:41:00 GMT

    [ https://issues.apache.org/jira/browse/MESOS-8902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16504555#comment-16504555 ]

PAVEL DERENDYAEV commented on MESOS-8902:
-----------------------------------------

This looks like a duplicate of MESOS-8703: https://issues.apache.org/jira/projects/MESOS/issues/MESOS-8703

> Mesos masters stop to respond after zookeeper cluster recover
> -------------------------------------------------------------
>
>                 Key: MESOS-8902
>                 URL: https://issues.apache.org/jira/browse/MESOS-8902
>             Project: Mesos
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 1.4.0, 1.4.1, 1.5.0
>            Reporter: PAVEL DERENDYAEV
>            Priority: Major
>             Fix For: 1.3.0
>
>
> Hi.
> My setup consists of 3 ZooKeeper hosts and 3 Mesos master hosts.
> Everything works fine after startup. But if I:
>  # Stop 2 zk nodes.
>  # Wait until all masters stop responding to HTTP requests.
>  # Start the 2 zk nodes.
>  # Wait until the zk cluster has recovered.
>  # Then the only master able to respond to HTTP requests is the leader. The other two do not respond at all and just log:
> 2018-05-10 14:41:16,274:1(0x7fb4737fe700):ZOO_WARN@zookeeper_interest@1597: Exceeded deadline by 11ms
> 2018-05-10 14:42:56,377:1(0x7fb4737fe700):ZOO_WARN@zookeeper_interest@1597: Exceeded deadline by 13ms
> 2018-05-10 14:42:56,377:1(0x7fb4989a7700):ZOO_WARN@zookeeper_interest@1597: Exceeded deadline by 13ms
> 2018-05-10 14:43:19,783:1(0x7fb470ff9700):ZOO_WARN@zookeeper_interest@1597: Exceeded deadline by 12ms
> After being restarted, these 2 master nodes respond fine again.
> Reproduced on 1.4.0, 1.4.1 and 1.5.0; not reproduced on 1.3.0.
>  
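
The check in the last reproduction step (which masters still answer HTTP requests) could be scripted roughly as follows. This is only a sketch and not part of the original report: it assumes each master is polled on its /health endpoint, and the host names and port 5050 shown in the usage note are placeholders for the real master addresses.

```python
import http.client
import urllib.request


def probe(base_url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if GET <base_url>/health answers with HTTP 200 in time."""
    try:
        with opener(base_url + "/health", timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, DNS failures, HTTP error
        # statuses (urllib's URLError/HTTPError derive from OSError) and
        # malformed responses.
        return False


def unresponsive(masters, timeout=5.0, opener=urllib.request.urlopen):
    """Return the subset of master base URLs that did not answer the probe."""
    return [m for m in masters if not probe(m, timeout, opener)]
```

For example, unresponsive(["http://master1:5050", "http://master2:5050", "http://master3:5050"]) would be expected to list the two non-leading masters once the zk cluster has recovered, if the behaviour described above is reproduced. The `opener` parameter exists only so the probe can be exercised without a live cluster.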



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
