From: "Karolos Antoniadis (JIRA)"
To: dev@zookeeper.apache.org
Date: Fri, 2 Aug 2019 04:01:00 +0000 (UTC)
Subject: [jira] [Created] (ZOOKEEPER-3485) Measure reconfiguration time

Karolos Antoniadis created ZOOKEEPER-3485:
---------------------------------------------

             Summary: Measure reconfiguration time
                 Key: ZOOKEEPER-3485
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3485
             Project: ZooKeeper
          Issue Type: Improvement
    Affects Versions: 3.5.5
            Reporter: Karolos Antoniadis

This issue was created after some initial discussion on the _dev_ mailing list (subject "Leader election logging during reconfiguration").

There does not seem to be a good way to measure reconfiguration time in ZooKeeper. Additionally, reconfiguration time is mixed together with leader election time. For instance, during reconfiguration, ZooKeeper logs a {{LEADER ELECTION TOOK}} message even though no leader election may have taken place.

This can be reproduced by following these steps:

1) start a ZooKeeper cluster (e.g., 3 participants)
2) start a client that connects to some follower
3) perform a _reconfig_ operation that removes the leader from the cluster

After the reconfiguration takes place, the log files of the remaining participants contain a "_LEADER ELECTION TOOK_" message. For example, a line such as:

_2019-07-29 23:07:38,518 [myid:2] - INFO  [QuorumPeer[myid=2](plain=0.0.0.0:2792)(secure=disabled):Follower@75] - FOLLOWING - LEADER ELECTION TOOK - 57 MS_

However, no leader election took place, in the sense that no server went _LOOKING_ and then started voting and sending notifications to other participants, as would happen in a normal leader election. It seems that before the _reconfig_ is committed, the participant that is going to be the next leader is already decided (see here: [https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Leader.java#L865]).

The *goal* of this issue/improvement is to measure more accurately the time it takes for a reconfiguration to complete, and to clearly distinguish the measurement of reconfiguration time from that of leader election time.

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
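To see how often the "LEADER ELECTION TOOK" message shows up after a reconfiguration, the participants' log files can be scanned for it. The following is only a minimal sketch, not part of ZooKeeper: the regular expression is modeled on the single log line quoted in this issue, and the names `ELECTION_LINE`, `ElectionEntry` and `parse_election_line` are hypothetical. Other ZooKeeper versions or logging layouts may format the line differently. The sketch cannot tell a real leader election from a reconfiguration, which is exactly the ambiguity this issue describes.

```python
import re
from typing import NamedTuple, Optional

# Pattern modeled on the one log line quoted in this issue (an assumption:
# other versions or log layouts may not match it).
ELECTION_LINE = re.compile(
    r"\[myid:(?P<myid>\d+)\].*?"
    r"(?P<role>[A-Z]+) - LEADER ELECTION TOOK - (?P<millis>\d+) MS"
)


class ElectionEntry(NamedTuple):
    myid: int    # server id taken from the "[myid:N]" prefix
    role: str    # word preceding the message, e.g. "FOLLOWING"
    millis: int  # reported duration in milliseconds


def parse_election_line(line: str) -> Optional[ElectionEntry]:
    """Return the parsed entry, or None if the line is not an election-time message."""
    match = ELECTION_LINE.search(line)
    if match is None:
        return None
    return ElectionEntry(int(match["myid"]), match["role"], int(match["millis"]))


# The log line quoted in the issue description.
sample = (
    "2019-07-29 23:07:38,518 [myid:2] - INFO  "
    "[QuorumPeer[myid=2](plain=0.0.0.0:2792)(secure=disabled):Follower@75] "
    "- FOLLOWING - LEADER ELECTION TOOK - 57 MS"
)
entry = parse_election_line(sample)
print(entry)
```

For the quoted line, this yields server id 2, role FOLLOWING and a duration of 57 ms. Running it over the logs of all remaining participants after step 3 would show which servers reported an "election" time even though none of them went _LOOKING_.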