zookeeper-user mailing list archives

From Filip Deleersnijder <fi...@motum.be>
Subject Re: Leader election problems
Date Thu, 25 Jun 2015 06:39:56 GMT
Hi,

Thanks for your response.

Our application consists of 8 automatic vehicles in a warehouse setting. Those vehicles need
some consensus decisions, and that is what we use Zookeeper for.
Because vehicles can come and go at random, we installed a ZK participant on every vehicle.
The ZK client is some other piece of software that is also running on the vehicles.

Therefore:
	- We cannot choose the number of ZK participants, because it depends on the number of
vehicles.
	- The participants communicate over Wifi
	- The client is running on the same machine, so it communicates over the local network
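
As a rough illustration of why the participant count matters here: with ZooKeeper's default majority quorum, a leader can only be elected while more than half of the servers listed in zoo.cfg can reach each other. The sketch below only shows that arithmetic; the function names are hypothetical and not part of any ZooKeeper API.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of the configured voting servers."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers may be unreachable while a quorum still exists."""
    return ensemble_size - quorum_size(ensemble_size)


# Compare the 8-server ensemble from this thread with its odd-sized neighbours.
for n in (7, 8, 9):
    print(f"{n} servers: quorum={quorum_size(n)}, "
          f"tolerated failures={tolerated_failures(n)}")
```

Under this assumption, 8 configured servers need 5 reachable participants and tolerate 3 absent ones, the same tolerance as 7 servers.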

We are running Zookeeper version 3.4.6

Our zoo.cfg can be found below this e-mail.

Thanks in advance !

Filip

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial 
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just 
# example sakes.
dataDir=c:/motum/config/MASS/ZK
# the port at which the clients will connect
clientPort=2181

server.1=172.17.35.11:2888:3888
server.2=172.17.35.12:2888:3888
server.3=172.17.35.13:2888:3888
server.4=172.17.35.14:2888:3888
server.5=172.17.35.15:2888:3888
server.6=172.17.35.16:2888:3888
server.7=172.17.35.17:2888:3888
server.8=172.17.35.18:2888:3888

# The number of snapshots to retain in dataDir
autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
autopurge.purgeInterval=1
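
For readers who want the tick-based limits above in wall-clock terms, here is a minimal sketch that converts them to milliseconds. It assumes only that initLimit and syncLimit are counted in ticks of tickTime milliseconds, as the comments in the file state; the helper `parse_properties` is a hypothetical name, not a ZooKeeper function.

```python
def parse_properties(text: str) -> dict:
    """Parse key=value lines, skipping blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


# Timing values copied from the zoo.cfg in this message.
ZOO_CFG = """
# The number of milliseconds of each tick
tickTime=2000
initLimit=10
syncLimit=5
"""

cfg = parse_properties(ZOO_CFG)
tick_ms = int(cfg["tickTime"])
init_limit_ms = tick_ms * int(cfg["initLimit"])  # initial synchronization phase
sync_limit_ms = tick_ms * int(cfg["syncLimit"])  # request-to-acknowledgement window

print(f"initLimit = {init_limit_ms} ms, syncLimit = {sync_limit_ms} ms")
```

With these values the initial synchronization phase may take up to 20 seconds, and a follower has up to 10 seconds between a request and its acknowledgement.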



> On 24 Jun 2015, at 18:54, Raúl Gutiérrez Segalés <rgs@itevenworks.net> wrote:
> 
> Hi,
> 
> On 24 June 2015 at 06:05, Filip Deleersnijder <filip@motum.be> wrote:
> 
>> Hi,
>> 
>> Let’s start with some description of our system :
>> 
>> - We are using a Zookeeper cluster with 8 participants for an application
>> with mobile nodes ( connected over Wifi ).
>> 
> 
> You mean the participants talk over wifi or the clients?
> 
> 
>> ( Ip of the different nodes are according to the following structure :
>> Node X has IP : 172.17.35.1X )
>> 
> 
> Why 8 and not an odd number of machines (i.e.:
> http://zookeeper.apache.org/doc/trunk/zookeeperAdmin.html#sc_zkMulitServerSetup
> )?
> 
>> - It is not that unusual to have a node being shut down or restarted
>> - We haven’t benchmarked the number of write operations yet, but I would
>> estimate that it would be less than 10 writes / second
>> 
> 
> What version of ZK are you using?
> 
> 
>> 
>> The problem we are having however is that sometimes(*), some instances
>> seem to be having problems with leader election.
>> Under the header “Attachment 1” below, you can find the leader election
>> times that were needed over 24h ( from 1 node ). On average it took more
>> than 1 minute !
>> I assume that this is not normal behaviour ? ( If somebody could confirm
>> that in a 8-node cluster, these are not normal leader election times, that
>> would be nice )
>> 
>> In Attachment 2, I included an extract from the logging during a leader
>> election that took 101874ms for 1 node ( server 2 ).
>> 
>> Any help is greatly appreciated.
>> If further or more specific logging is required, please ask !
>> 
>> 
> Do you mind sharing a copy of your config file (zoo.cfg)? Thanks!
> 
> 
> -rgs

