Date: Thu, 3 Mar 2016 23:29:18 +0000 (UTC)
From: "Inigo Goiri (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-9882) Add heartbeatsTotal in Datanode metrics

    [ https://issues.apache.org/jira/browse/HDFS-9882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15178841#comment-15178841 ]

Inigo Goiri commented on HDFS-9882:
-----------------------------------

I think we need to add a description to Metrics.md.
Other than that, I think the patch is good (I don't see any related unit tests for this, and fixing the remaining checkstyle warning would break the style of the class).

[~andrew.wang], [~arpitagarwal], do you guys think this is a useful addition?
We found that the heartbeats were reported as running smoothly, but block report processing was actually getting stuck because of the disk and delaying the heartbeats, which wasn't easy to monitor.
Actually, we are planning to open a separate JIRA to move some of the disk-related checks to a separate thread.

> Add heartbeatsTotal in Datanode metrics
> ---------------------------------------
>
>                 Key: HDFS-9882
>                 URL: https://issues.apache.org/jira/browse/HDFS-9882
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode
>   Affects Versions: 2.7.2
>            Reporter: Hua Liu
>            Assignee: Hua Liu
>            Priority: Minor
>         Attachments: 0001-HDFS-9882.Add-heartbeatsTotal-in-Datanode-metrics.patch, 0002-HDFS-9882.Add-heartbeatsTotal-in-Datanode-metrics.patch
>
>
> Heartbeat latency only reflects the time spent on generating reports and sending reports to NN. When heartbeats are delayed due to processing commands, this latency does not help investigation. I would like to propose to add another metric counter to show the total time.
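
For readers of the archive, the distinction the issue description draws (time spent sending the heartbeat versus total time including command processing) can be sketched roughly as follows. This is only an illustration under stated assumptions: it is not the attached patch, it does not use Hadoop's metrics classes, and `HeartbeatTimingSketch`, `runCycle`, and the injected clock are all hypothetical names.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/**
 * Hypothetical sketch only: not the HDFS-9882 patch and not a Hadoop API.
 * Tracks two durations per heartbeat cycle so that a slow command-processing
 * step shows up even when the send step itself is fast.
 */
public class HeartbeatTimingSketch {
    private final LongSupplier nanoClock;
    // Time spent generating and sending the heartbeat only.
    private final AtomicLong sendNanos = new AtomicLong();
    // Time for the whole cycle, including processing returned commands.
    private final AtomicLong totalNanos = new AtomicLong();
    private final AtomicLong cycles = new AtomicLong();

    public HeartbeatTimingSketch() {
        this(System::nanoTime);
    }

    /** The clock is injectable so the timing logic can be checked deterministically. */
    public HeartbeatTimingSketch(LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
    }

    /** Runs one cycle: send the heartbeat, then process the commands it returned. */
    public void runCycle(Runnable sendHeartbeat, Runnable processCommands) {
        long start = nanoClock.getAsLong();
        sendHeartbeat.run();
        long afterSend = nanoClock.getAsLong();
        processCommands.run();
        long end = nanoClock.getAsLong();

        sendNanos.addAndGet(afterSend - start);
        totalNanos.addAndGet(end - start);
        cycles.incrementAndGet();
    }

    public long getSendNanos() { return sendNanos.get(); }
    public long getTotalNanos() { return totalNanos.get(); }
    public long getCycles() { return cycles.get(); }
}
```

With counters like these, a large gap between the total and the send time would point at command processing (for example, a disk-bound step) rather than at the heartbeat RPC itself.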