>>> You mentioned that a client sends a ping every 1/3 the session timeout.

Yes, you are correct. To analyse your issue, we also have to consider the
re-connection timeout, which is "session timeout / listed servers count":

https://github.com/apache/zookeeper/blob/branch-3.4/src/java/main/org/apache/zookeeper/ClientCnxn.java#L1292
https://github.com/apache/zookeeper/blob/branch-3.4/src/java/main/org/apache/zookeeper/ClientCnxn.java#L1098

Coincidentally, in your example the heartbeat interval and the
re-connection interval are the same, as you have three servers.

>>>>> It looks like C3 has taken 14 seconds to determine the disconnected event
>>>>> and another 14 seconds to determine that it cannot connect to Server B
>>>>> (C3 is isolated from B).

With this info, the total elapsed time is 28 secs, which is less than the
45 secs session timeout. Now, the client has 17 secs (45 secs - 28 secs) to
re-establish a connection with server A, right? Could you please check
whether the client is connecting to A during this period?

Rakesh

On Thu, Mar 2, 2017 at 6:58 PM, Tharindu Kumara wrote:

> Hi Rakesh,
>
> First of all, thank you for the quick reply.
>
> >>>>> Actually, ZooKeeper client has retry mechanism.
> >>>>> Client sends a ping every 1/3 the session timeout (here, 3 is the no.
> >>>>> of listed servers, A, B, C) and then looks for a response before another
> >>>>> 1/3 elapses. This allows time to reconnect to a different server (and still
> >>>>> maintain the session) if the connected server becomes unavailable.
>
> You mentioned that a client sends a ping every 1/3 the session timeout, and
> that 3 is the no. of listed servers.
>
> I doubt that, because I am using the C binding and, after inspecting the
> code, it looks like 3 is a hard-coded value.
> Simply, no matter what the number of clients, the zk client binding is always
> sending a ping every 1/3 session timeout.
>
> Can you please clarify that for me?
>
> Here I used a tick of 3000ms and a session expiration timeout of 45000ms.
>
> And please find the screenshot of the extracted client log output.
>
> https://anonimag.es/image/JT9htnL
>
> It looks like C3 has taken 14 seconds to determine the disconnected event
> and another 14 seconds to determine that it cannot connect to Server B
> (C3 is isolated from B).
>
>
> On Thu, Mar 2, 2017 at 4:08 PM, Rakesh Radhakrishnan wrote:
>
> > >>>> According to my understanding, it looks like, when a client trying to
> > >>>> connect to a server that it cannot connect due to a network partitioning,
> > >>>> it uses a blocking call and it waits too much time trying to
> > >>>> connect to a server that it cannot communicate.
> >
> > Actually, ZooKeeper client has retry mechanism.
> > Client sends a ping every 1/3 the session timeout (here, 3 is the no. of
> > listed servers, A, B, C) and then looks for a response before another 1/3
> > elapses. This allows time to reconnect to a different server (and still
> > maintain the session) if the connected server becomes unavailable.
> >
> > Could you grep the following log message in your client log and tell me how
> > much time C3 has taken for the re-connection attempts?
> > "Client session timed out, have not heard from server in "
> >
> > C3 might have first attempted to reconnect to B and then A. Also, we need to
> > check how much time C3 has taken to detect the connection failure from server C.
> >
> > Could you please share the zk client log to dig more?
> >
> > Rakesh
> >
> >
> > On Thu, Mar 2, 2017 at 11:04 AM, Tharindu Kumara <zonik.hatkumara@gmail.com>
> > wrote:
> >
> > > > 1) Could you tell me the status of Server C, is this lost connection to the
> > > > quorum and fails to join quorum continuously as B is the Leader ?
> > >
> > > Yes, B is the leader. Server C is completely isolated from the Leader (B)
> > > and it cannot communicate with the Leader. C cannot continuously connect to
> > > the Leader.
> > >
> > >
> > > > 2) C3 is connected C.
> > > > Please tell me the connection host string passed to
> > > > this client. Does it contain all three servers info "A:clientport,
> > > > B:clientport, C:clientport" ?
> > >
> > > Yes, C3's connection string contains all three servers ("A:clientport,
> > > B:clientport, C:clientport").
> > >
> > >
> > > > 3) Please check all three servers and client C3 logs to see any
> > > > inconsistencies or exceptions.
> > >
> > > After looking at the logs, it seems that when server C is isolated from the
> > > Leader, a disconnect event fires to client C3. Then it (C3) spends too much
> > > time trying to connect to Server B (Leader).
> > >
> > > But it cannot connect to server B, as we blocked the connection between
> > > Server C and Server B. Basically, C3 tries for more than half of the session
> > > timeout to connect to Server B.
> > >
> > > Then, after figuring out that it cannot connect to Server B, C3 tries to
> > > connect to Server A, and it connects to Server A successfully. But this is
> > > too late, because the session has already expired by the time C3 connects.
> > >
> > > And this happens only sometimes. When we specify all the servers in the
> > > client's connect string, sometimes after C3 disconnects from Server C,
> > > instead of trying to connect to Server B it connects to Server A as the
> > > first attempt. In this case the client C3 connects to the quorum
> > > successfully before the session expiration.
> > >
> > > According to my understanding, it looks like, when a client is trying to
> > > connect to a server that it cannot reach due to a network partition, it
> > > uses a blocking call and waits too long trying to connect to a server that
> > > it cannot communicate with.
> > >
> > >
> > > > 4) ZooKeeper version used in your testing ?
> > >
> > > I used zookeeper 3.4.9 (current stable release)
> > >
> > >
> > > On Thu, Mar 2, 2017 at 7:48 AM, Rakesh Radhakrishnan <rakeshr@apache.org>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > Could you please give few more details,
> > > >
> > > > 1) Could you tell me the status of Server C, is this lost connection to the
> > > > quorum and fails to join quorum continuously as B is the Leader ?
> > > >
> > > > 2) C3 is connected C. Please tell me the connection host string passed to
> > > > this client. Does it contain all three servers info "A:clientport,
> > > > B:clientport, C:clientport" ?
> > > >
> > > > 3) Please check all three servers and client C3 logs to see any
> > > > inconsistencies or exceptions.
> > > >
> > > > 4) ZooKeeper version used in your testing ?
> > > >
> > > >
> > > > Rakesh
> > > >
> > > > On Wed, Mar 1, 2017 at 4:55 PM, Tharindu Kumara <zonik.hatkumara@gmail.com>
> > > > wrote:
> > > >
> > > > > Recently, I carried out a test to find the behavior of clients when a
> > > > > client is partitioned from the ensemble.
> > > > >
> > > > > Here I used an ensemble of 3 zookeeper servers called A, B and C. And the
> > > > > quorum was set up like below.
> > > > >
> > > > > A - Follower
> > > > > B - Leader
> > > > > C - Follower
> > > > >
> > > > > A <---> B <---> C
> > > > >   \____________/
> > > > >
> > > > > And 3 clients are connected to the ensemble like below.
> > > > >
> > > > > C1 is connected A
> > > > > C2 is connected B
> > > > > C3 is connected C.
> > > > >
> > > > > I used iptables to remove the network link between B and C.
> > > > >
> > > > > command used: iptables -I INPUT -s 123.123.45.123 -j DROP
> > > > >
> > > > > After removing the link, the connections look like below.
> > > > >
> > > > > A <----> B       C
> > > > >   \____________/
> > > > >
> > > > > Simply, there is no way to communicate from B to C and vice versa.
> > > > >
> > > > > Here what I noticed is that the client connected to ZooKeeper Server
> > > > > "C" could not connect to the ensemble, resulting in a session expiration
> > > > > timeout.
> > > > >
> > > > > For this experiment I used a tickTime of 3000ms and a client session
> > > > > expiration timeout of 45000ms. And tested with different combinations also.
> > > > >
> > > > > Can someone please explain what is the root cause for this behavior?
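
The timing arithmetic discussed in this thread can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class and method names are hypothetical and are not part of the ZooKeeper API; the two formulas (heartbeat every 1/3 of the session timeout, re-connection timeout of "session timeout / listed servers count") are taken from the messages above, and the two 14-second figures are the values reported from the C3 client log.

```java
// Sketch only: SessionTiming and its methods are hypothetical helper names,
// not ZooKeeper classes. The formulas come from the thread, not from the API.
final class SessionTiming {

    // Heartbeat interval as described in the thread: 1/3 of the session timeout.
    static int pingIntervalMs(int sessionTimeoutMs) {
        return sessionTimeoutMs / 3;
    }

    // Re-connection timeout as described in the thread:
    // session timeout divided by the number of servers in the connect string.
    static int connectTimeoutMs(int sessionTimeoutMs, int listedServers) {
        if (listedServers <= 0) {
            throw new IllegalArgumentException("listedServers must be positive");
        }
        return sessionTimeoutMs / listedServers;
    }

    // Time left in the session after the given delays have elapsed (never negative).
    static int remainingMs(int sessionTimeoutMs, int... elapsedMs) {
        int remaining = sessionTimeoutMs;
        for (int e : elapsedMs) {
            remaining -= e;
        }
        return Math.max(remaining, 0);
    }

    public static void main(String[] args) {
        int sessionTimeoutMs = 45_000; // session timeout used in the experiment
        int listedServers = 3;         // A, B and C in the connect string

        System.out.println(pingIntervalMs(sessionTimeoutMs));                  // 15000
        System.out.println(connectTimeoutMs(sessionTimeoutMs, listedServers)); // 15000

        // Observed in the C3 log: 14 s to detect the disconnect, 14 s failing on B.
        System.out.println(remainingMs(sessionTimeoutMs, 14_000, 14_000));     // 17000
    }
}
```

With three listed servers both intervals come out to 15 seconds, which is why they coincide in this experiment; with a different number of servers only the re-connection timeout would change under these formulas.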