hadoop-hdfs-issues mailing list archives

From "Chen Liang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11907) NameNodeResourceChecker should avoid calling df.getAvailable too frequently
Date Fri, 02 Jun 2017 17:44:04 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035112#comment-16035112 ]

Chen Liang commented on HDFS-11907:
-----------------------------------

Thanks [~andrew.wang] for the reply!

df does seem to be a fairly cheap operation in general, but we have seen cases where we suspect
this call was slow under certain conditions, which we are still analyzing. As for changing the
monitorHealth check interval: since we still want the ZKFC process to contact the NameNode
frequently enough to detect failures ASAP, we probably don't want to lower the frequency
on the caller's side.

> NameNodeResourceChecker should avoid calling df.getAvailable too frequently
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-11907
>                 URL: https://issues.apache.org/jira/browse/HDFS-11907
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Chen Liang
>            Assignee: Chen Liang
>         Attachments: HDFS-11907.001.patch, HDFS-11907.002.patch, HDFS-11907.003.patch, HDFS-11907.004.patch
>
>
> Currently, {{HealthMonitor#doHealthChecks}} invokes {{NameNode#monitorHealth}}, which ends up invoking {{NameNodeResourceChecker#isResourceAvailable}} once per second by default. {{NameNodeResourceChecker#isResourceAvailable}} in turn invokes {{df.getAvailable()}} every time it is called.
> Available space should rarely change dramatically from one second to the next, so a cached value should be sufficient: fetch an updated value only when the cached value is too old, and otherwise simply return the cached value. This way {{df.getAvailable()}} gets invoked less often.
> Thanks [~arpitagarwal] for the offline discussion.
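
The caching idea in the issue description above can be illustrated with a minimal sketch. This is not the code from the attached patches: the class name CachedAvailableSpace, the maxAgeMs parameter, and the injected clock are all hypothetical, and the underlying df.getAvailable() call is represented by a generic LongSupplier so the example stays self-contained.

```java
import java.util.function.LongSupplier;

/**
 * Hypothetical helper (not part of Hadoop): caches a long value and only
 * re-reads it from the underlying source once the cached copy is too old.
 */
final class CachedAvailableSpace {
    private final LongSupplier source;   // assumed to wrap something like df.getAvailable()
    private final LongSupplier clockMs;  // injected clock in milliseconds, for testability
    private final long maxAgeMs;         // how long a cached value is considered fresh

    private long cachedValue;
    private long lastRefreshMs;
    private boolean initialized = false;

    CachedAvailableSpace(LongSupplier source, LongSupplier clockMs, long maxAgeMs) {
        this.source = source;
        this.clockMs = clockMs;
        this.maxAgeMs = maxAgeMs;
    }

    /** Returns the cached value, refreshing it first if it is missing or too old. */
    synchronized long get() {
        long now = clockMs.getAsLong();
        if (!initialized || now - lastRefreshMs >= maxAgeMs) {
            cachedValue = source.getAsLong();  // the potentially slow call
            lastRefreshMs = now;
            initialized = true;
        }
        return cachedValue;
    }
}
```

With this shape, a caller polling once per second and a maxAgeMs of, say, 5000 would hit the underlying source roughly once every five seconds, while the caller's own check frequency stays unchanged. The trade-off is that a sudden drop in available space may go unnoticed for up to maxAgeMs.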



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

