Date: Fri, 6 Jul 2012 20:34:36 +0000 (UTC)
From: "Hudson (JIRA)"
To: hdfs-issues@hadoop.apache.org
Reply-To: hdfs-issues@hadoop.apache.org
Message-ID: <821766197.15872.1341606876538.JavaMail.jiratomcat@issues-vm>
In-Reply-To: <1768953951.41262.1333159525936.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HDFS-3170) Add more useful metrics for write latency

[ https://issues.apache.org/jira/browse/HDFS-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408291#comment-13408291 ]

Hudson commented on HDFS-3170:
------------------------------

Integrated in Hadoop-Hdfs-trunk #1095 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1095/])
HDFS-3170. Add more useful metrics for write latency.
Contributed by Matthew Jacobs. (Revision 1357970)

Result = FAILURE
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1357970
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/PipelineAck.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/metrics/DataNodeMetrics.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/datatransfer.proto
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeMetrics.java

> Add more useful metrics for write latency
> -----------------------------------------
>
>                 Key: HDFS-3170
>                 URL: https://issues.apache.org/jira/browse/HDFS-3170
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node
>    Affects Versions: 2.0.0-alpha
>            Reporter: Todd Lipcon
>            Assignee: Matthew Jacobs
>             Fix For: 2.0.1-alpha
>
>         Attachments: hdfs-3170.txt, hdfs-3170.txt, hdfs-3170.txt
>
>
> Currently, the only write-latency related metric we expose is the total amount of time taken by opWriteBlock. This is practically useless, since (a) different blocks may be wildly different sizes, and (b) if the writer is only generating data slowly, it will make a block write take longer by no fault of the DN. I would like to propose two new metrics:
> 1) *flush-to-disk time*: count how long it takes for each call to flush an incoming packet to disk (including the checksums). In most cases this will be close to 0, as it only flushes to buffer cache, but if the backing block device enters congested writeback, it can take much longer, which provides an interesting metric.
> 2) *round trip to downstream pipeline node*: track the round trip latency for the part of the pipeline between the local node and its downstream neighbors. When we add a new packet to the ack queue, save the current timestamp. When we receive an ack, update the metric based on how long since we sent the original packet. This gives a metric of the total RTT through the pipeline. If we also include this metric in the ack to upstream, we can subtract the amount of time due to the later stages in the pipeline and have an accurate count of this particular link.
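
As a rough illustration of item 2 in the quoted description, the per-link arithmetic might look like the sketch below. This is not the code committed in revision 1357970. The class name PipelineRttSketch, its method names, the nanosecond units, and the clamp to zero are all assumptions made for the example. It records a timestamp when a packet joins the ack queue, computes the total RTT when the ack returns, and subtracts the RTT that the downstream node reported in its ack.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the RTT bookkeeping described in HDFS-3170, item 2. */
class PipelineRttSketch {
    // Send time (ns) of each packet still awaiting a downstream ack, keyed by sequence number.
    private final Map<Long, Long> sendTimeNanos = new HashMap<>();

    /** Call when a packet is added to the ack queue. */
    public void onPacketEnqueued(long seqno, long nowNanos) {
        sendTimeNanos.put(seqno, nowNanos);
    }

    /**
     * Call when the ack for seqno arrives from downstream.
     * Returns the total RTT through the rest of the pipeline,
     * or -1 if no send time was recorded for seqno.
     */
    public long onAckReceived(long seqno, long nowNanos) {
        Long sent = sendTimeNanos.remove(seqno);
        return (sent == null) ? -1L : nowNanos - sent;
    }

    /**
     * Subtracts the RTT reported by the downstream node in its ack
     * (0 for the last node in the pipeline) to isolate this link.
     * Clamping at zero is an assumption of this sketch.
     */
    public static long linkLatency(long totalRttNanos, long downstreamRttNanos) {
        return Math.max(0L, totalRttNanos - downstreamRttNanos);
    }

    public static void main(String[] args) {
        PipelineRttSketch node = new PipelineRttSketch();
        node.onPacketEnqueued(7L, 1_000L);            // packet 7 sent at t = 1000 ns
        long total = node.onAckReceived(7L, 1_900L);  // ack arrives at t = 1900 ns
        long link = linkLatency(total, 600L);         // downstream reported 600 ns
        System.out.println("total=" + total + " link=" + link); // prints: total=900 link=300
    }
}
```

Both durations are measured on a single node's clock, so the subtraction does not depend on clocks being synchronized across datanodes. A real caller would presumably pass System.nanoTime() for the timestamps.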