hadoop-hdfs-issues mailing list archives

From "Chen Liang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11907) NameNodeResourceChecker should avoid calling df.getAvailable too frequently
Date Fri, 02 Jun 2017 00:24:04 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16033960#comment-16033960 ]

Chen Liang commented on HDFS-11907:
-----------------------------------

Thanks [~andrew.wang] for the comments! We would prefer not to use DFCachingGetSpaceUsed here, though, because:
1. This JIRA is about maintaining the *available space* value, while DFCachingGetSpaceUsed tracks *used space*. We would have to modify that class further (or create a new one) in order to use it.
2. It seems that each instance of this class uses an extra background thread that periodically updates the value, which seems a bit overkill to me.

But if you do think it is better to use DFCachingGetSpaceUsed, I will try to update with another patch.

> NameNodeResourceChecker should avoid calling df.getAvailable too frequently
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-11907
>                 URL: https://issues.apache.org/jira/browse/HDFS-11907
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Chen Liang
>            Assignee: Chen Liang
>         Attachments: HDFS-11907.001.patch, HDFS-11907.002.patch, HDFS-11907.003.patch, HDFS-11907.004.patch
>
>
> Currently, {{HealthMonitor#doHealthChecks}} invokes {{NameNode#monitorHealth}}, which ends up invoking {{NameNodeResourceChecker#isResourceAvailable}} once per second by default. {{NameNodeResourceChecker#isResourceAvailable}} in turn invokes {{df.getAvailable()}} every time it is called.
> Available space rarely changes dramatically from one second to the next, so a cached value should be sufficient: fetch an updated value only when the cached one is too old, and otherwise simply return the cached value. This way {{df.getAvailable()}} is invoked less often.
> Thanks [~arpitagarwal] for the offline discussion.
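
The caching idea in the description could be sketched roughly as follows. This is only an illustration and is not taken from any of the attached patches: the class name CachedAvailableSpace, the injected clock, and the maxAgeMs parameter are all hypothetical, and the space source is modeled as a plain LongSupplier standing in for a call such as df.getAvailable(). The value is refreshed lazily on the caller's thread, so no background thread is needed.

```java
import java.util.function.LongSupplier;

/**
 * Hypothetical helper (not from the HDFS-11907 patches): returns a cached
 * available-space value and refreshes it only when it is older than maxAgeMs.
 */
final class CachedAvailableSpace {
    private final LongSupplier source;   // stand-in for something like df::getAvailable
    private final LongSupplier clockMs;  // injectable clock, e.g. System::currentTimeMillis
    private final long maxAgeMs;         // how long a cached value stays valid

    private long cachedValue;
    private long lastRefreshMs;
    private boolean initialized;

    CachedAvailableSpace(LongSupplier source, LongSupplier clockMs, long maxAgeMs) {
        this.source = source;
        this.clockMs = clockMs;
        this.maxAgeMs = maxAgeMs;
    }

    /** Returns the cached value, refreshing it first if it is missing or too old. */
    synchronized long getAvailable() {
        long now = clockMs.getAsLong();
        if (!initialized || now - lastRefreshMs >= maxAgeMs) {
            cachedValue = source.getAsLong();  // the expensive call happens only here
            lastRefreshMs = now;
            initialized = true;
        }
        return cachedValue;
    }
}
```

With a health check running once per second and, say, a maxAgeMs of several seconds, most calls would return the cached value without touching the underlying source. The trade-off is that a caller may see a value that is up to maxAgeMs out of date.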



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

