mesos-user mailing list archives

From Tomas Barton <>
Subject Re: Slaves in a different dc/region?
Date Sun, 08 Jun 2014 18:04:01 GMT
Each Mesos slave keeps a ZooKeeper session, that's correct. When the connection
is lost, the slave will simply reconnect:

ZOO_ERROR@handle_socket_error_msg@1643: Socket [] zk
retcode=-7, errno=110(Connection timed out): connection to
timed out (exceeded timeout by 1ms)
I0606 06:25:14.611565 19666 group.cpp:415] Lost connection to ZooKeeper,
attempting to reconnect ...
2014-06-06 06:25:17,947:19643(0x7f98f9e15700):ZOO_WARN@zookeeper_interest@1557:
Exceeded deadline by 3337ms
2014-06-06 06:25:17,950:19643(0x7f98f9e15700):ZOO_INFO@check_events@1703:
initiated connection to server []
2014-06-06 06:25:18,381:19643(0x7f98f9e15700):ZOO_INFO@check_events@1750:
session establishment complete on server [],
sessionId=0x246349f15510076, negotiated timeout=10000
I0606 06:25:18.381938 19667 group.cpp:310] Group process ((5)@ reconnected to ZooKeeper
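
As a rough check on the log above, the gap between the "Lost connection" and "reconnected" lines can be compared with the negotiated session timeout. This is only a sketch: the helper name is made up for illustration, and the client-side timestamps give a lower bound on how long the ZooKeeper server went without hearing from the slave, since the server is the one that decides when a session expires.

```python
from datetime import datetime

def outage_seconds(lost_at, reconnected_at, fmt="%H:%M:%S.%f"):
    """Seconds between two same-day log timestamps (hypothetical helper)."""
    start = datetime.strptime(lost_at, fmt)
    end = datetime.strptime(reconnected_at, fmt)
    return (end - start).total_seconds()

# Values copied from the log excerpt above.
NEGOTIATED_TIMEOUT_MS = 10000
gap = outage_seconds("06:25:14.611565", "06:25:18.381938")

# True when the client-observed gap is shorter than the session timeout.
within_timeout = gap * 1000 < NEGOTIATED_TIMEOUT_MS
print(f"gap: {gap:.3f}s, within session timeout: {within_timeout}")
```

Here the gap is a bit under 4 seconds against a 10 second timeout, which is consistent with the slave reporting a plain reconnect.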

There isn't much information stored in ZooKeeper; it's pretty much just the
IP address of the master node, so the communication won't be very intensive.
However, the slave nodes have to send updates on the state of their assigned
tasks to the Mesos master. If computing each task takes, let's say, a few
minutes and the communication delay is 100 ms, it should be fine.
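
To put rough numbers on this, the share of wall time spent waiting on the network can be estimated from the task duration and the link delay. This is a back-of-envelope sketch, not a model of Mesos internals: the function name and the assumption of two master/slave exchanges per task are made up for illustration.

```python
def latency_overhead(task_seconds, delay_seconds, round_trips=2):
    """Fraction of total wall time spent waiting on the network.

    round_trips is a hypothetical number of master<->slave exchanges per
    task (e.g. launch plus a final status update); not a Mesos constant.
    """
    waiting = delay_seconds * round_trips
    return waiting / (task_seconds + waiting)

# A task of a few minutes vs. a task of a few hundred milliseconds,
# both with a 100 ms communication delay.
long_task = latency_overhead(task_seconds=180.0, delay_seconds=0.1)
short_task = latency_overhead(task_seconds=0.3, delay_seconds=0.1)
print(f"long task:  {long_task:.2%}")
print(f"short task: {short_task:.2%}")
```

Under these assumptions the long task loses roughly 0.1% of its time to the delay, while the short task loses about 40%, which is why short Spark-style tasks are the case to worry about.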

On 8 June 2014 17:19, David Greenberg <> wrote:

> I believe that slaves only use ZK to discover the masters initially--they
> directly communicate with them from then on, so the problem of WAN
> latencies is somewhat mitigated.
> On Sun, Jun 8, 2014 at 10:45 AM, Jordan Zimmerman <> wrote:
>> But if the slaves try to maintain a ZooKeeper connection there will be
>> instability. WANs aren’t very reliable and ZK clients maintain a session.
>> Do the slaves query only? What would happen if the slave lost connection to
>> ZooKeeper?
>> -Jordan
>> From: Tomas Barton
>> Reply:
>> Date: June 8, 2014 at 8:06:47 AM
>> To: user
>> Subject:  Re: Slaves in a different dc/region?
>>  Hi,
>> generally it should work. The Mesos slave gets the current master's IP
>> address from ZooKeeper. The ZooKeeper servers should be deployed in one
>> datacenter (usually 3 or 5 instances). If you run long-running tasks on
>> Mesos, it should be fine. If you deploy e.g. Spark, which
>> tends to have quite short tasks (let's say a few hundred milliseconds), the
>> computations might be slower due to the longer communication delay.
>> It really depends on your use case; it might be a good idea to have a Mesos
>> cluster in each datacenter. However, you might try adjusting schedulers so
>> that they respect slave location, e.g. prefer allocating tasks from
>> one framework in the same datacenter, if the resources are available.
>> Tomas
>> On 8 June 2014 06:34, Jordan Zimmerman <>
>> wrote:
>>>  Has anyone tried running a slave in a different datacenter than the
>>> master? It seems the slaves connect to ZooKeeper. Is that correct? If so,
>>> cross-data center might not work.
>>>  Thanks!
