hadoop-hdfs-issues mailing list archives

From "Aaron T. Myers (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
Date Fri, 21 Nov 2014 00:20:35 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220285#comment-14220285 ]

Aaron T. Myers commented on HDFS-7331:

bq. I know most of the metric systems will collect metrics and plot graphs directly. I'm unaware
of any metrics systems that can do subtraction out-of-the-box. Some pointers are appreciated.

Ganglia (and I bet basically all monitoring software) can do this out of the box. In Ganglia,
one need only specify the slope of the metric to be a counter instead of a gauge, in which
case the default is to store the values as a per-second rate. In general, counters are strictly
better than rates calculated on the producer side, since the monitoring software can derive
a rate from a counter, but cannot recover a counter from a rate. The monitoring sampling
frequency also matters much less with a counter: if the rate is calculated on the producer
side and the monitoring software's sampling frequency is too low, anomalies that occur
between samples can be missed, with no way to detect that this has happened.
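
To make the counter-versus-rate point concrete, here is a minimal sketch of how a consumer could derive per-second rates from raw counter samples. The function name, the sample values, and the choice to skip intervals where the counter goes backwards (for example after a restart) are all assumptions made for illustration; this is not how Ganglia or the patch implements it.

```python
def rates_from_counter(samples):
    """Derive per-second rates from (timestamp_seconds, counter_value) samples.

    `samples` must be sorted by timestamp. Intervals where the counter
    decreases are skipped, on the assumption that the producer restarted.
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        dt = t1 - t0
        delta = v1 - v0
        if dt <= 0 or delta < 0:
            continue  # bad ordering or counter reset: no rate for this interval
        rates.append((t1, delta / dt))
    return rates


# Hypothetical networkErrors counter, sampled once a minute.
# A burst of 30 errors happens between t=60 and t=120.
samples = [(0, 0), (60, 0), (120, 30), (180, 30)]

fine = rates_from_counter(samples)
# Same counter, sampled only at the first and last timestamps.
coarse = rates_from_counter([samples[0], samples[-1]])

print(fine)    # one rate per minute; the burst shows up in the middle interval
print(coarse)  # a single averaged rate, still nonzero
```

Even with the coarse sampling, the 30 errors remain visible as a nonzero average. A rate computed on the producer side and polled only at t=0 and t=180 could read zero both times, and the burst would be lost.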

> Add Datanode network counts to datanode jmx page
> ------------------------------------------------
>                 Key: HDFS-7331
>                 URL: https://issues.apache.org/jira/browse/HDFS-7331
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>            Reporter: Charles Lamb
>            Assignee: Charles Lamb
>            Priority: Minor
>         Attachments: HDFS-7331.001.patch, HDFS-7331.002.patch, HDFS-7331.003.patch, HDFS-7331.004.patch
> Add per-datanode counts to the datanode jmx page. For example, networkErrors could be
> exposed like this:
> {noformat}
>   }, {
> ...
>     "DatanodeNetworkCounts" : "{\"dn1\":{\"networkErrors\":1}}",
> ...
>     "NamenodeAddresses" : "{\"localhost\":\"BP-1103235125-\"}",
>     "VolumeInfo" : "{\"/tmp/hadoop-cwl/dfs/data/current\":{\"freeSpace\":3092725760,\"usedSpace\":28672,\"reservedSpace\":0}}",
>     "ClusterId" : "CID-4b38f2ae-5e58-4e15-b3cf-3ba3f46e724e"
>   }, {
> {noformat}
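
In the example above, the value of "DatanodeNetworkCounts" is itself a JSON document encoded as a string, so a consumer has to decode it a second time. A minimal sketch, assuming the bean has already been fetched and parsed into a dict; the helper name is hypothetical and the bean content is copied from the example:

```python
import json


def decode_network_counts(bean):
    """Return the per-datanode counts from an already-parsed JMX bean dict.

    The "DatanodeNetworkCounts" attribute is a JSON string nested inside the
    bean, so it needs its own json.loads call.
    """
    return json.loads(bean["DatanodeNetworkCounts"])


# Bean fragment taken from the example in the issue description.
bean = {"DatanodeNetworkCounts": '{"dn1":{"networkErrors":1}}'}

counts = decode_network_counts(bean)
for datanode, values in counts.items():
    print(datanode, values["networkErrors"])
```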
