hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3343) Improve metrics for DN read latency
Date Thu, 21 Jun 2012 20:24:43 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398825#comment-13398825 ]

Todd Lipcon commented on HDFS-3343:

A few quick comments:
- Now that I see again how complicated {{transferToFully}} is, I no longer think we should
copy-paste it, as I suggested earlier. Instead, we should add an overload to SocketOutputStream:

{{void transferToFully(FileChannel ch, int pos, int len, MutableCounterLong transferTime,
MutableCounterLong waitTime)}}

(and have the old call delegate to it, passing null for the metrics)

- The new metrics in DataNode need better names (e.g. "readDataPacketFromDiskMillis" and "sendDataPacketToNetworkMillis"
or something?), and I think they should be MutableRates instead of counters, right? That is, you
need to count the number of ops in addition to the total time, or else the total time is uninterpretable.
- I think the elapsed times should be summed locally inside the loop, and then added
to the metric only once, at the end of each packet. Otherwise it will skew the averages.
- It seems like we can add a simple unit test (or just a new assertion to an existing test
like TestPRead) that these counters have non-zero values.
- Maybe the unit for these should be microseconds instead of milliseconds? Given that a lot of
reads should hit the buffer cache, more precision seems useful.
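
To make the suggestions above concrete, here is a minimal Java sketch. It is not the code in the attached patch and does not use the real Hadoop classes: {{SimpleRate}} is a hypothetical stand-in for a rate-style metric (op count plus total time), and {{sendPacket}} with its {{Runnable}} argument is a hypothetical stand-in for the per-packet send loop. It only illustrates the three ideas: the old call delegates and passes null for the metric, time is summed locally inside the loop and recorded once per packet, and elapsed time is measured with nanosecond resolution and reported in microseconds.

```java
import java.util.concurrent.TimeUnit;

public class PacketTimingSketch {

    /** Hypothetical stand-in for a rate metric: tracks op count and total time. */
    static final class SimpleRate {
        private long numOps;
        private long totalMicros;

        /** Records one operation that took the given number of microseconds. */
        synchronized void add(long elapsedMicros) {
            numOps++;
            totalMicros += elapsedMicros;
        }

        synchronized long numOps() { return numOps; }

        synchronized long totalMicros() { return totalMicros; }

        /** The mean is only meaningful because the op count is tracked too. */
        synchronized double meanMicros() {
            return numOps == 0 ? 0.0 : (double) totalMicros / numOps;
        }
    }

    /** Old-style entry point: delegates and passes null for the metric. */
    static void sendPacket(int chunks, Runnable chunkWork) {
        sendPacket(chunks, chunkWork, null);
    }

    /** New entry point: the metric is optional and may be null. */
    static void sendPacket(int chunks, Runnable chunkWork, SimpleRate rate) {
        long elapsedNanos = 0; // summed locally, not pushed to the metric per chunk
        for (int i = 0; i < chunks; i++) {
            long start = System.nanoTime();
            chunkWork.run(); // placeholder for the real transfer work
            elapsedNanos += System.nanoTime() - start;
        }
        if (rate != null) {
            // One sample per packet, so the op count equals the packet count.
            rate.add(TimeUnit.NANOSECONDS.toMicros(elapsedNanos));
        }
    }

    public static void main(String[] args) {
        SimpleRate rate = new SimpleRate();
        Runnable work = () -> { /* pretend I/O */ };
        for (int p = 0; p < 3; p++) {
            sendPacket(4, work, rate);
        }
        sendPacket(4, work); // untimed caller, nothing recorded
        System.out.println(rate.numOps() + " packets, mean "
                + rate.meanMicros() + " us");
    }
}
```

If the metric were updated on every loop iteration instead, each chunk would count as its own op and the reported mean would be a per-chunk time rather than a per-packet time.
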
> Improve metrics for DN read latency
> -----------------------------------
>                 Key: HDFS-3343
>                 URL: https://issues.apache.org/jira/browse/HDFS-3343
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node
>            Reporter: Todd Lipcon
>            Assignee: Andrew Wang
>         Attachments: hdfs-3343.patch
> Similar to HDFS-3170 on the write side, we should improve the metrics that are generated
> on the DN for read latency. We should have separate metrics for the time spent in {{transferTo}}
> vs {{waitWritable}} so that it's easy to distinguish slow local disks from slow readers on
> the other end of the socket.


