hadoop-hdfs-issues mailing list archives

From "Arpit Agarwal (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-11907) NameNodeResourceChecker should avoid calling df.getAvailable too frequently
Date Wed, 31 May 2017 18:59:04 GMT

     [ https://issues.apache.org/jira/browse/HDFS-11907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arpit Agarwal updated HDFS-11907:
---------------------------------
    Description: 
Currently, {{HealthMonitor#doHealthChecks}} invokes {{NameNode#monitorHealth}}, which ends
up invoking {{NameNodeResourceChecker#isResourceAvailable}} once per second by default.
{{NameNodeResourceChecker#isResourceAvailable}} in turn invokes {{df.getAvailable()}}
every time it is called.

Since available space rarely changes dramatically from one second to the next, a cached
value should be sufficient: fetch an updated value only when the cached value is too old,
and otherwise simply return the cached value. This way {{df.getAvailable()}} is invoked
less often.
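
The caching idea described above can be sketched roughly as follows. This is only an illustration, not the patch attached to this issue: the class name {{CachedAvailableSpace}}, its constructor parameters, and the injectable clock are hypothetical and do not exist in Hadoop. The sketch assumes the expensive call can be passed in as a {{LongSupplier}}.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/**
 * Hypothetical helper, not part of Hadoop: caches a long value (for example
 * the result of df.getAvailable()) and re-reads it from the source only when
 * the cached copy is older than a configured maximum age.
 */
final class CachedAvailableSpace {
  private final LongSupplier source;      // the expensive call to be throttled
  private final LongSupplier clockNanos;  // monotonic clock, injectable for testing
  private final long maxAgeNanos;

  private boolean initialized = false;
  private long cachedValue;
  private long lastRefreshNanos;

  CachedAvailableSpace(LongSupplier source, long maxAgeMillis, LongSupplier clockNanos) {
    this.source = source;
    this.clockNanos = clockNanos;
    this.maxAgeNanos = TimeUnit.MILLISECONDS.toNanos(maxAgeMillis);
  }

  /** Returns the cached value, refreshing it first if it is missing or too old. */
  synchronized long get() {
    long now = clockNanos.getAsLong();
    if (!initialized || now - lastRefreshNanos >= maxAgeNanos) {
      cachedValue = source.getAsLong();
      lastRefreshNanos = now;
      initialized = true;
    }
    return cachedValue;
  }
}
```

Under these assumptions, a caller that polls once per second could hold an instance created as {{new CachedAvailableSpace(df::getAvailable, 5000, System::nanoTime)}} and call {{get()}} on each health check; the 5000 ms maximum age is an arbitrary illustrative value. A monotonic clock is used so that wall-clock adjustments do not make the cached value look fresher or staler than it is.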

Thanks [~arpitagarwal] for the offline discussion.

  was:
Currently, {{HealthMonitor#doHealthChecks}} invokes {{NameNode#monitorHealth}}, which ends
up invoking {{NameNodeResourceChecker#isResourceAvailable}} once per second by default.
{{NameNodeResourceChecker#isResourceAvailable}} in turn invokes {{df.getAvailable()}}
every time it is called, which can potentially be a very expensive operation.

Since available space rarely changes dramatically from one second to the next, a cached
value should be sufficient: fetch an updated value only when the cached value is too old,
and otherwise simply return the cached value. This way {{df.getAvailable()}} is invoked
less often.

Thanks [~arpitagarwal] for the offline discussion.


> NameNodeResourceChecker should avoid calling df.getAvailable too frequently
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-11907
>                 URL: https://issues.apache.org/jira/browse/HDFS-11907
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Chen Liang
>            Assignee: Chen Liang
>
> Currently, {{HealthMonitor#doHealthChecks}} invokes {{NameNode#monitorHealth}}, which ends
> up invoking {{NameNodeResourceChecker#isResourceAvailable}} once per second by default.
> {{NameNodeResourceChecker#isResourceAvailable}} in turn invokes {{df.getAvailable()}}
> every time it is called.
> Since available space rarely changes dramatically from one second to the next, a cached
> value should be sufficient: fetch an updated value only when the cached value is too old,
> and otherwise simply return the cached value. This way {{df.getAvailable()}} is invoked
> less often.
> Thanks [~arpitagarwal] for the offline discussion.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

