hadoop-hdfs-issues mailing list archives

From "Zhe Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-10475) Adding metrics for long FSNamesystem read and write locks
Date Fri, 02 Sep 2016 21:01:20 GMT

    [ https://issues.apache.org/jira/browse/HDFS-10475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15459579#comment-15459579 ]

Zhe Zhang commented on HDFS-10475:
----------------------------------

Thanks for confirming this, Xiaoyu.

[~xkrogen] has started some work in this direction. We now have INFO-level logging when the
write lock is held for over 1 second or the read lock is held for over 5 seconds (both
thresholds are configurable). We also have a metric for the lock queue length.
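
To make the threshold logging concrete, here is a minimal sketch of timing a write lock hold and reporting it when it exceeds a configurable threshold. It is not the actual FSNamesystem lock code: the class name TimedWriteLockSketch, its methods, and the injectable clock are all hypothetical, and System.out stands in for a real logger. Only the write side is shown; the read side would need per-thread timing, since several readers can hold the lock at once.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.LongSupplier;

// Hypothetical sketch; not the actual FSNamesystem lock implementation.
class TimedWriteLockSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
    private final long writeThresholdMs;   // e.g. 1000 for a 1-second threshold
    private final LongSupplier clockMs;    // injectable clock (assumption, for testability)
    private final AtomicInteger longWriteHolds = new AtomicInteger();
    private long writeAcquiredMs;          // only touched while the write lock is held

    TimedWriteLockSketch(long writeThresholdMs, LongSupplier clockMs) {
        this.writeThresholdMs = writeThresholdMs;
        this.clockMs = clockMs;
    }

    void writeLock() {
        lock.writeLock().lock();
        // The lock is reentrant, so only the outermost acquisition starts the timer.
        if (lock.getWriteHoldCount() == 1) {
            writeAcquiredMs = clockMs.getAsLong();
        }
    }

    void writeUnlock() {
        boolean outermost = lock.getWriteHoldCount() == 1;
        long heldMs = clockMs.getAsLong() - writeAcquiredMs;
        lock.writeLock().unlock();
        // Report after releasing, so the logging itself does not lengthen the hold.
        if (outermost && heldMs > writeThresholdMs) {
            longWriteHolds.incrementAndGet();
            System.out.println("INFO write lock held for " + heldMs
                + " ms (threshold " + writeThresholdMs + " ms)");
        }
    }

    // Number of holds that exceeded the threshold so far.
    int longWriteHoldCount() {
        return longWriteHolds.get();
    }

    // Estimate of the number of threads waiting for the read or write lock.
    int queueLength() {
        return lock.getQueueLength();
    }
}
```

A caller would wrap each namespace operation in writeLock() / writeUnlock() in a try/finally block; queueLength() is the kind of value a queue length metric could sample.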

I think we can use this JIRA to discuss what other metrics to add:
# A complete op -> aggregate lock time map?
# Top _n_ types of RPC calls with longest read / write lock?
# Top _n_ paths leading to longest aggregate read / write lock?
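
As a starting point for that discussion, the first two ideas could be sketched roughly as below: a concurrent map from operation name to aggregate lock hold time, plus a top-n query over a snapshot of it. This is only an illustration under assumptions. The class LockTimeByOpSketch, its method names, and the operation names in the usage note are hypothetical, and nothing here is wired into the existing metrics system.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch of an "op -> aggregate lock time" map with a top-n query.
class LockTimeByOpSketch {
    private final Map<String, LongAdder> heldNanosByOp = new ConcurrentHashMap<>();

    // Called once per lock release with the time the lock was held for that op.
    void record(String opName, long heldNanos) {
        heldNanosByOp.computeIfAbsent(opName, k -> new LongAdder()).add(heldNanos);
    }

    // Aggregate hold time recorded for one op, or 0 if the op was never seen.
    long total(String opName) {
        LongAdder adder = heldNanosByOp.get(opName);
        return adder == null ? 0L : adder.sum();
    }

    // Names of the n ops with the largest aggregate hold time, largest first.
    List<String> topN(int n) {
        // Snapshot the sums first so concurrent updates cannot disturb the sort.
        List<Map.Entry<String, Long>> snapshot = new ArrayList<>();
        for (Map.Entry<String, LongAdder> e : heldNanosByOp.entrySet()) {
            snapshot.add(new AbstractMap.SimpleImmutableEntry<>(e.getKey(), e.getValue().sum()));
        }
        snapshot.sort(Map.Entry.<String, Long>comparingByValue().reversed());
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(n, snapshot.size()); i++) {
            result.add(snapshot.get(i).getKey());
        }
        return result;
    }
}
```

For example, after record("mkdirs", 5), record("getBlockLocations", 20), record("mkdirs", 30) and record("delete", 10), topN(2) returns mkdirs followed by getBlockLocations. Keying the same structure by path instead of op name would cover the third idea, although the number of distinct paths would need to be bounded in practice.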

A similar idea was [mentioned | https://issues.apache.org/jira/browse/HDFS-10713?focusedCommentId=15453607&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15453607]
under HDFS-10713. So pinging [~jingzhao] [~liuml07] [~arpitagarwal] and [~hanishakoneru] for
opinions.

> Adding metrics for long FSNamesystem read and write locks
> ---------------------------------------------------------
>
>                 Key: HDFS-10475
>                 URL: https://issues.apache.org/jira/browse/HDFS-10475
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Xiaoyu Yao
>            Assignee: Erik Krogen
>
> This is a follow-up to the comment on HADOOP-12916 and [here|https://issues.apache.org/jira/browse/HDFS-9924?focusedCommentId=15310837&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15310837]:
> add more metrics and WARN/DEBUG logs for long FSD/FSN locking operations on the namenode,
> similar to what we have for slow write/network WARN/metrics on the datanode.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

