Date: Thu, 26 Oct 2017 20:05:00 +0000 (UTC)
From: "ASF GitHub Bot (JIRA)"
To: dev@zookeeper.apache.org
Reply-To: dev@zookeeper.apache.org
Subject: [jira] [Commented] (ZOOKEEPER-2770) ZooKeeper slow operation log

[ https://issues.apache.org/jira/browse/ZOOKEEPER-2770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16221123#comment-16221123 ]

ASF GitHub Bot commented on ZOOKEEPER-2770:
-------------------------------------------

Github user karanmehta93 commented on the issue:

    https://github.com/apache/zookeeper/pull/307

For some reason, Jenkins did not build it automatically. I triggered a manual build with the SHA1 of my latest commit; the results are at https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1123/

All the Findbugs warnings have been resolved, and the single failing test seems to be unrelated to my change (or is possibly a flaky test, since it passes on my local machine). This commit squashes the changes made in response to everyone's earlier input. Thank you!

The default value for `requestWarnThresholdMs` is currently 10 seconds, which is too high. Users can set it according to the average latency values they typically see from the serverstats.
The patch also limits the logging rate, so if the value is set too low, it prints a message every 60 seconds showing the number of high-latency requests in the past minute.

The motivation for this JIRA is given in the description at https://issues.apache.org/jira/browse/ZOOKEEPER-2770.

It would be great if you could review this PR. @hanm @tdunning @eribeiro @skamille

If we can get this JIRA in, the plan is to work with @tdunning's t-digest and use it to identify high-latency requests based on the request type.

Thanks!

> ZooKeeper slow operation log
> ----------------------------
>
>                 Key: ZOOKEEPER-2770
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2770
>             Project: ZooKeeper
>          Issue Type: Improvement
>            Reporter: Karan Mehta
>            Assignee: Karan Mehta
>         Attachments: ZOOKEEPER-2770.001.patch, ZOOKEEPER-2770.002.patch, ZOOKEEPER-2770.003.patch
>
>
> ZooKeeper is a complex distributed application. There are many reasons why any given read or write operation may become slow: a software bug, a protocol problem, a hardware issue with the commit log(s), a network issue. If the problem is constant it is trivial to come to an understanding of the cause. However in order to diagnose intermittent problems we often don't know where, or when, to begin looking. We need some sort of timestamped indication of the problem. Although ZooKeeper is not a datastore, it does persist data, and can suffer intermittent performance degradation, and should consider implementing a 'slow query' log, a feature very common to services which persist information on behalf of clients which may be sensitive to latency while waiting for confirmation of successful persistence.
> Log the client and request details if the server discovers, when finally processing the request, that the current time minus arrival time of the request is beyond a configured threshold.
> Look at the HBase {{responseTooSlow}} feature for inspiration.
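
For readers following the thread, the threshold check and the rate limiting discussed above can be sketched roughly as below. This is an illustration only and not the code from the pull request: the class `SlowRequestLog`, its `record` method, the message format, and the exact suppression policy are all hypothetical. The sketch assumes that a request counts as slow when current time minus arrival time exceeds the threshold, and that at most one message is produced per reporting interval, carrying a count of the slow requests that were not logged individually.

```java
import java.util.Optional;

// Hypothetical sketch; not the implementation from the pull request.
final class SlowRequestLog {
    private final long warnThresholdMs;  // stands in for requestWarnThresholdMs
    private final long reportIntervalMs; // 60_000 would match "every 60 seconds"
    private boolean windowOpen = false;  // true once a first message was produced
    private long windowStartMs = 0;      // time of the last produced message
    private long suppressed = 0;         // slow requests counted but not logged

    SlowRequestLog(long warnThresholdMs, long reportIntervalMs) {
        this.warnThresholdMs = warnThresholdMs;
        this.reportIntervalMs = reportIntervalMs;
    }

    // Returns the message to log for this request, or empty if nothing
    // should be logged (fast request, or rate limit still in effect).
    synchronized Optional<String> record(String client, String request,
                                         long arrivalMs, long nowMs) {
        long latencyMs = nowMs - arrivalMs;
        if (latencyMs <= warnThresholdMs) {
            return Optional.empty();
        }
        if (windowOpen && nowMs - windowStartMs < reportIntervalMs) {
            suppressed++; // count it, but stay quiet until the interval ends
            return Optional.empty();
        }
        String message = String.format(
            "Slow request from %s: %s took %d ms (threshold %d ms); "
                + "%d other slow request(s) suppressed since the last message",
            client, request, latencyMs, warnThresholdMs, suppressed);
        windowOpen = true;
        windowStartMs = nowMs;
        suppressed = 0;
        return Optional.of(message);
    }
}
```

A caller would pass the request's arrival time and the current time in milliseconds and hand any returned message to its logger. Passing the clock in as a parameter is a design choice of this sketch that keeps the logic easy to test; real server code would more likely read the time itself.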