hadoop-hdfs-issues mailing list archives

From "Erik Krogen (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-12131) Add some of the FSNamesystem JMX values as metrics
Date Tue, 01 Aug 2017 22:48:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16109935#comment-16109935 ]

Erik Krogen edited comment on HDFS-12131 at 8/1/17 10:47 PM:
-------------------------------------------------------------

Hey [~andrew.wang], thanks for continuing to help work on this. I added tests for {{VolumeFailuresTotal}},
{{EstimatedCapacityLostTotal}}, and {{DecommissioningDataNodes}} in the v005 patch.

I did not write tests for {{NumInMaintenance(Live|Dead)DataNodes}}, {{NumEnteringMaintenanceDataNodes}},
and {{NumStaleStorages}}. The code to coerce the maintenance state transitions in {{TestMaintenanceState}}
relies heavily on functions from {{AdminStatesBaseTest}} which I would rather not replicate
just to test a metric output (when the underlying value is already being tested). I can't
find existing test code showing how to coerce a nonzero {{NumStaleStorages}} and again would
rather not spend too much effort trying to test just a metric value - I think time would be
better spent actually adding a test for stale storages if one does not yet exist (I was unable
to find one that I might be able to use as an example). If you have pointers let me know.


was (Author: xkrogen):
Hey [~andrew.wang], thanks for continuing to help work on this. I added tests for {{VolumeFailuresTotal}},
{{EstimatedCapacityLostTotal}}, and {{DecommissioningDataNodes}}.

I did not write tests for {{NumInMaintenance(Live|Dead)DataNodes}}, {{NumEnteringMaintenanceDataNodes}},
and {{NumStaleStorages}}. The code to coerce the maintenance state transitions in {{TestMaintenanceState}}
relies heavily on functions from {{AdminStatesBaseTest}} which I would rather not replicate
just to test a metric output (when the underlying value is already being tested). I can't
find existing test code showing how to coerce a nonzero {{NumStaleStorages}} and again would
rather not spend too much effort trying to test just a metric value - I think time would be
better spent actually adding a test for stale storages if one does not yet exist (I was unable
to find one that I might be able to use as an example). If you have pointers let me know.

> Add some of the FSNamesystem JMX values as metrics
> --------------------------------------------------
>
>                 Key: HDFS-12131
>                 URL: https://issues.apache.org/jira/browse/HDFS-12131
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs, namenode
>            Reporter: Erik Krogen
>            Assignee: Erik Krogen
>            Priority: Minor
>         Attachments: HDFS-12131.000.patch, HDFS-12131.001.patch, HDFS-12131.002.patch,
> HDFS-12131.002.patch, HDFS-12131.003.patch, HDFS-12131.004.patch, HDFS-12131.005.patch
>
>
> Several useful values are emitted via the FSNamesystem JMX, but not through the
> metrics system. It would be useful to track these over time, e.g. to alert on them via
> standard metrics systems or to view trends and rates of change:
> * NumLiveDataNodes
> * NumDeadDataNodes
> * NumDecomLiveDataNodes
> * NumDecomDeadDataNodes
> * NumDecommissioningDataNodes
> * NumStaleStorages
> * VolumeFailuresTotal
> * EstimatedCapacityLostTotal
> * NumInMaintenanceLiveDataNodes
> * NumInMaintenanceDeadDataNodes
> * NumEnteringMaintenanceDataNodes
> This is a simple change that just requires annotating the JMX methods with {{@Metric}}.
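
To illustrate the annotation-based approach described in the issue, here is a minimal, self-contained sketch. It does not use Hadoop: the nested {{Metric}} annotation, the {{NamesystemStats}} class, and the {{collect}} helper are all hypothetical stand-ins, assumed here only to show how annotating existing getters lets a collector pick them up without further changes. The actual patch may well differ.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

public class MetricAnnotationSketch {

    // Hypothetical stand-in for a metrics annotation; NOT Hadoop's real @Metric.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface Metric {
        String value() default "";
    }

    // Hypothetical stand-in for a bean that already exposes values as getters.
    public static class NamesystemStats {
        @Metric("Number of live DataNodes")
        public int getNumLiveDataNodes() { return 3; }

        @Metric("Number of dead DataNodes")
        public int getNumDeadDataNodes() { return 1; }

        // Not annotated: still a getter, but the collector below skips it.
        public int getNumStaleStorages() { return 0; }
    }

    // Collects the values of all public no-arg methods carrying @Metric,
    // keyed by the method name with a leading "get" removed.
    public static Map<String, Object> collect(Object source)
            throws ReflectiveOperationException {
        Map<String, Object> out = new TreeMap<>();
        for (Method m : source.getClass().getMethods()) {
            if (m.isAnnotationPresent(Metric.class) && m.getParameterCount() == 0) {
                String name = m.getName();
                if (name.startsWith("get")) {
                    name = name.substring(3);
                }
                out.put(name, m.invoke(source));
            }
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(collect(new NamesystemStats()));
    }
}
```

In this sketch, exposing {{NumStaleStorages}} as well would only require adding the annotation to its getter, which is the sense in which the change is described as simple.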



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

