hadoop-hdfs-issues mailing list archives

From "Yiqun Lin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
Date Tue, 06 Aug 2019 10:10:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900849#comment-16900849 ]

Yiqun Lin commented on HDFS-14313:

[~leosun08], can you rebase the patch? I applied the v012 patch locally, and the change to
core-default.xml failed to apply. Please git pull the latest code and then generate
a new patch file.
error: patch failed: hadoop-common-project/hadoop-common/src/main/resources/core-default.xml:3494
error: hadoop-common-project/hadoop-common/src/main/resources/core-default.xml: patch does not apply
Also, can you update the description to the following? In the original design, GetSpaceUsed
is not only used for reporting HDFS space used.
      The class that estimates how much space is used in a directory.
      Four implementation classes are supported:
      org.apache.hadoop.fs.DU (default), org.apache.hadoop.fs.WindowsGetSpaceUsed,
      org.apache.hadoop.fs.DFCachingGetSpaceUsed and ReplicaCachingGetSpaceUsed.
      The ReplicaCachingGetSpaceUsed implementation class is only used in the HDFS module.
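
To illustrate what "implementation class" means in the description above, here is a minimal, self-contained sketch of choosing a space-usage implementation by class name. It is not Hadoop's GetSpaceUsed API: the SpaceUsed interface, the FixedSpaceUsed class, and the create method are hypothetical names invented for this example, and the real selection logic in Hadoop may differ.

```java
// Hypothetical sketch: pick a space-usage implementation from a class name,
// the way a configuration value might name one. Not Hadoop's actual API.
class PluggableSpaceUsedSketch {

    /** Hypothetical stand-in for a "how much space is used" contract. */
    interface SpaceUsed {
        long getUsed();
    }

    /** Hypothetical implementation that reports a fixed number of bytes. */
    static class FixedSpaceUsed implements SpaceUsed {
        @Override
        public long getUsed() {
            return 42L;
        }
    }

    /** Instantiates the named class through its no-argument constructor. */
    static SpaceUsed create(String className) {
        try {
            return Class.forName(className)
                    .asSubclass(SpaceUsed.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot create " + className, e);
        }
    }

    public static void main(String[] args) {
        // In a real deployment the class name would come from configuration.
        SpaceUsed impl = create(FixedSpaceUsed.class.getName());
        System.out.println(impl.getUsed());
    }
}
```

With this shape, adding a new implementation only requires a new class and a changed class name, which is why the description has to list every supported class.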

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
> ----------------------------------------------------------------------------------------
>                 Key: HDFS-14313
>                 URL: https://issues.apache.org/jira/browse/HDFS-14313
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, performance
>    Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>            Reporter: Lisheng Sun
>            Assignee: Lisheng Sun
>            Priority: Major
>         Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, HDFS-14313.002.patch,
>                      HDFS-14313.003.patch, HDFS-14313.004.patch, HDFS-14313.005.patch,
>                      HDFS-14313.006.patch, HDFS-14313.007.patch, HDFS-14313.008.patch,
>                      HDFS-14313.009.patch, HDFS-14313.010.patch, HDFS-14313.011.patch,
>                      HDFS-14313.012.patch
> The two existing ways of getting used space, DU and DF, are both insufficient.
>  #  Running DU across lots of disks is very expensive, and running all of the processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk is shared by multiple datanodes or other servers.
>  Getting HDFS used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory is very cheap and accurate.
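
The in-memory approach described in the issue can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: the Replica class is a hypothetical stand-in for ReplicaInfo, the map keyed by block pool ID is a simplified stand-in for volumeMap, and counting block bytes plus metadata bytes per replica is an assumption about what "used space" includes.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: compute used space by summing replica sizes that are
// already tracked in memory, instead of scanning the disk with du or asking df.
class ReplicaSpaceSketch {

    /** Hypothetical stand-in for a replica record; not Hadoop's ReplicaInfo. */
    static final class Replica {
        final long blockId;
        final long bytesOnDisk;     // size of the block file
        final long metadataLength;  // size of the checksum/metadata file

        Replica(long blockId, long bytesOnDisk, long metadataLength) {
            this.blockId = blockId;
            this.bytesOnDisk = bytesOnDisk;
            this.metadataLength = metadataLength;
        }
    }

    /** Sums block and metadata bytes of all replicas in one block pool. */
    static long usedBytes(Map<String, List<Replica>> volumeMap, String blockPoolId) {
        List<Replica> replicas = volumeMap.get(blockPoolId);
        if (replicas == null) {
            replicas = Collections.emptyList();
        }
        long used = 0L;
        for (Replica r : replicas) {
            used += r.bytesOnDisk + r.metadataLength;
        }
        return used;
    }

    public static void main(String[] args) {
        Map<String, List<Replica>> volumeMap = new HashMap<>();
        volumeMap.put("bp-1", Arrays.asList(
                new Replica(1L, 100L, 8L),
                new Replica(2L, 200L, 8L)));
        System.out.println(usedBytes(volumeMap, "bp-1")); // prints 316
    }
}
```

The loop touches no files, so it causes no disk IO, and it counts only replicas this datanode knows about, so other processes sharing the disk do not affect the result.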
