hadoop-hdfs-issues mailing list archives

From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8710) Always read DU value from the cached "dfsUsed" file on datanode startup
Date Fri, 03 Jul 2015 06:08:04 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612876#comment-14612876 ]

Allen Wittenauer commented on HDFS-8710:

On startup is exactly when you want the du to be recalculated because there is a good chance
that the disk structure changed.

> Always read DU value from the cached "dfsUsed" file on datanode startup
> -----------------------------------------------------------------------
>                 Key: HDFS-8710
>                 URL: https://issues.apache.org/jira/browse/HDFS-8710
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Xinwei Qin 
>            Assignee: Xinwei Qin 
>         Attachments: HDFS-8710.001.patch
> Currently, the DataNode periodically caches the DU value in the "dfsUsed" file. When the
> DataNode starts or restarts, it reads the cached DU value from the "dfsUsed" file if that
> value is less than 600 seconds old; otherwise, it runs the DU command, which is very
> time-consuming (it may take up to dozens of minutes) when the DataNode has a huge number of blocks.
> Since slight imprecision in dfsUsed is not critical, and the DU value will be updated
> every 600 seconds (the default DU interval) after the DataNode has started, we can always
> read the DU value from the cached file (regardless of whether this value is less than
> 600 seconds old) and skip the DU operation on DataNode startup to significantly shorten the startup time.
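
The difference between the current behaviour and the proposal described above can be sketched in a few lines of Java. This is only an illustration, not the code from HDFS-8710.001.patch: the class name `DfsUsedCacheSketch`, the method `readCachedDfsUsed`, the `ignoreAge` flag, and the assumed cache line format `<usedBytes> <savedAtMillis>` are all hypothetical. An empty result stands for "fall back to a full DU scan".

```java
import java.util.OptionalLong;

public class DfsUsedCacheSketch {
    // Default DU interval quoted in the issue description: 600 seconds.
    static final long DU_INTERVAL_MS = 600_000L;

    /**
     * Returns the cached dfsUsed value if it may be used, or empty if the
     * caller should run a full DU scan instead.
     *
     * cacheLine is assumed (for this sketch) to look like "<usedBytes> <savedAtMillis>".
     * ignoreAge = false models the current behaviour (the cache must be fresh);
     * ignoreAge = true models the proposal (always trust the cache on startup).
     */
    static OptionalLong readCachedDfsUsed(String cacheLine, long nowMs, boolean ignoreAge) {
        if (cacheLine == null) {
            return OptionalLong.empty();
        }
        String[] parts = cacheLine.trim().split("\\s+");
        if (parts.length != 2) {
            return OptionalLong.empty();
        }
        long used;
        long savedAt;
        try {
            used = Long.parseLong(parts[0]);
            savedAt = Long.parseLong(parts[1]);
        } catch (NumberFormatException e) {
            // An unreadable cache is treated like a missing one.
            return OptionalLong.empty();
        }
        if (used < 0 || savedAt <= 0) {
            return OptionalLong.empty();
        }
        boolean fresh = nowMs - savedAt < DU_INTERVAL_MS;
        return (ignoreAge || fresh) ? OptionalLong.of(used) : OptionalLong.empty();
    }
}
```

With `ignoreAge` set to `false`, a cache entry older than 600 seconds yields an empty result and forces the expensive scan; with `ignoreAge` set to `true`, the same entry is returned as-is. The second case is the one the comment above objects to, because the disk contents may have changed while the DataNode was down.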

This message was sent by Atlassian JIRA
