hadoop-hdfs-issues mailing list archives

From "Jiandan Yang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-14045) Use different metrics in DataNode to better measure latency of heartbeat/blockReports/incrementalBlockReports of Active/Standby NN
Date Sat, 10 Nov 2018 13:11:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-14045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682395#comment-16682395 ]

Jiandan Yang  commented on HDFS-14045:
--------------------------------------

Thanks very much for your comments, [~xkrogen].
{quote}
Can we change the name of the method/parameter to something indicating it is for metrics only,
maybe like nnLatencyMetricsSuffix? It looks particularly odd to me in IncrementalBlockReportManager
right now.
{quote}
I renamed {{nnLatencyMetricsSuffix}} to {{rpcMetricSuffix}}. What do you think of this name?
{quote}
I think I would prefer to see the existing methods in DataNodeMetrics changed to update both
metrics, rather than the caller having to remember to call both methods. It introduces less
possibility for the two metrics to get out of sync later.
{quote}
Very good suggestion. In patch008 the existing methods now update both metrics in a single
call. However, the serviceId-nnId is needed to update the per-NameNode metric, so the existing
methods take an additional parameter that is used as the metric name suffix.
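
To illustrate the single-method approach, here is a minimal sketch. It is not the code from the patch: {{DualMetricsSketch}}, its methods, and the {{"HeartbeatsFor" + suffix}} naming scheme are hypothetical stand-ins chosen for illustration, and a plain map replaces the Hadoop metrics classes so the example runs on its own.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for DataNodeMetrics; not the code from the patch.
class DualMetricsSketch {
  // Per metric name: [0] = number of samples, [1] = total latency in ms.
  private final Map<String, long[]> stats = new TreeMap<>();

  private void record(String name, long latencyMs) {
    long[] s = stats.computeIfAbsent(name, k -> new long[2]);
    s[0]++;
    s[1] += latencyMs;
  }

  // One call updates the overall metric and, when a suffix is given,
  // the per-NameNode metric, so a caller cannot update only one of them.
  void addHeartbeat(long latencyMs, String rpcMetricSuffix) {
    record("Heartbeats", latencyMs);
    if (rpcMetricSuffix != null) {
      // Assumed naming scheme: base name + "For" + serviceId-nnId.
      record("HeartbeatsFor" + rpcMetricSuffix, latencyMs);
    }
  }

  long numOps(String name) {
    long[] s = stats.get(name);
    return s == null ? 0 : s[0];
  }

  double avgTime(String name) {
    long[] s = stats.get(name);
    return (s == null || s[0] == 0) ? 0.0 : (double) s[1] / s[0];
  }

  public static void main(String[] args) {
    DualMetricsSketch metrics = new DualMetricsSketch();
    metrics.addHeartbeat(10, "ns1-nn1"); // e.g. the Active NameNode
    metrics.addHeartbeat(30, "ns1-nn2"); // e.g. the Standby NameNode
    System.out.println("overall avg: " + metrics.avgTime("Heartbeats"));
    System.out.println("nn2 avg: " + metrics.avgTime("HeartbeatsForns1-nn2"));
  }
}
```

Because the suffix is passed into the same method that updates the overall metric, the two values are recorded together and cannot drift apart.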
{quote}
I'm not sure if you should re-use the same MutableRatesWithAggregation for all of the metrics.
It seems cleaner to me to have one per metric type, e.g. one for heartbeats, one for lifeline,
and so on, but let me know if you disagree. I think this may even make it so that, if you
set up the names correctly, the MutableRatesWithAggregation can replace the existing MutableRate
while maintaining the name of the metric. Not 100% sure on this.
{quote}
I prefer to re-use a single MutableRatesWithAggregation for simplicity: new metrics can then be
added without adding new fields.
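
The trade-off can be sketched as follows. This is an illustration only, under the assumption that the shared container is keyed by metric name. {{SharedRatesSketch}} is a hypothetical stand-in, not a Hadoop class, and the metric names are made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a name-keyed rates container; not a Hadoop class.
class SharedRatesSketch {
  // Per metric name: [0] = number of samples, [1] = total elapsed ms.
  private final Map<String, long[]> rates = new LinkedHashMap<>();

  // Metrics are created lazily on first use, so a new metric type
  // needs only a new name, not a new field in the owning class.
  void add(String name, long elapsedMs) {
    long[] r = rates.computeIfAbsent(name, k -> new long[2]);
    r[0]++;
    r[1] += elapsedMs;
  }

  long numOps(String name) {
    long[] r = rates.get(name);
    return r == null ? 0 : r[0];
  }

  Set<String> names() {
    return rates.keySet();
  }

  public static void main(String[] args) {
    SharedRatesSketch rates = new SharedRatesSketch();
    String suffix = "ns1-nn1"; // assumed serviceId-nnId value
    rates.add("HeartbeatsFor" + suffix, 12);
    rates.add("LifelinesFor" + suffix, 3);
    rates.add("BlockReportsFor" + suffix, 250);
    System.out.println(rates.names());
  }
}
```

With one container per metric type, as suggested in the quoted comment, each of the three names above would instead need its own field. That is more explicit but requires a code change for every new metric type.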
{quote}
You should update Metrics.md documenting these new metrics
{quote}
Thanks for the reminder about Metrics.md. The newly added metrics are documented in Metrics.md
in patch008.

> Use different metrics in DataNode to better measure latency of heartbeat/blockReports/incrementalBlockReports
of Active/Standby NN
> ----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-14045
>                 URL: https://issues.apache.org/jira/browse/HDFS-14045
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>            Reporter: Jiandan Yang 
>            Assignee: Jiandan Yang 
>            Priority: Major
>         Attachments: HDFS-14045.001.patch, HDFS-14045.002.patch, HDFS-14045.003.patch,
HDFS-14045.004.patch, HDFS-14045.005.patch, HDFS-14045.006.patch, HDFS-14045.007.patch
>
>
> Currently the DataNode uses the same metrics to measure the RPC latency of every NameNode, but
the Active and Standby usually perform differently at the same time, especially in a large cluster.
For example, the RPC latency of the Standby is very long while it is catching up on the edit log,
so we may misunderstand the state of HDFS. Using different metrics for Active and Standby helps us
obtain more precise metric data.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

