hadoop-hdfs-issues mailing list archives

From "Lisheng Sun (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
Date Mon, 24 Jun 2019 04:14:01 GMT

    [ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16870797#comment-16870797 ]

Lisheng Sun commented on HDFS-14313:
------------------------------------

[~jojochuang] Thanks for your comment. I added synchronized to FsDatasetImpl#deepCopyReplica.
{code:java}
  @Override
  public synchronized Set<? extends Replica> deepCopyReplica(String bpid)
      throws IOException {
    Set<? extends Replica> replicas =
        new HashSet<>(volumeMap.replicas(bpid) == null ? Collections.EMPTY_SET
            : volumeMap.replicas(bpid));
    return replicas;
  }
{code}
I did not use FsDatasetImpl#datasetLock, because FsDatasetImpl#addBlockPool, which holds datasetLock,
calls FsDatasetImpl#deepCopyReplica in another thread.
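
The locking choice above can be illustrated with a minimal sketch. It assumes a simplified stand-in for the dataset: DatasetSketch, its String-based replica map, and addReplica are hypothetical names for illustration, not the real HDFS classes. The copy is taken under the object's own monitor rather than a separate dataset lock, and the map is looked up only once.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for a dataset that tracks replicas per block pool. */
class DatasetSketch {
  // bpid -> replica names; a simplified stand-in for volumeMap (assumption).
  private final Map<String, Set<String>> volumeMap = new HashMap<>();

  /** Registers a replica under the object's monitor. */
  public synchronized void addReplica(String bpid, String replica) {
    volumeMap.computeIfAbsent(bpid, k -> new HashSet<>()).add(replica);
  }

  /**
   * Returns an independent copy of the replica set for one block pool.
   * The method is synchronized on this object, not on a separate lock,
   * so a caller that already holds another lock in a different thread
   * does not have to release it for the copy to proceed.
   * Note: the set is copied, the elements themselves are not.
   */
  public synchronized Set<String> deepCopyReplica(String bpid) {
    Set<String> replicas = volumeMap.get(bpid); // single lookup
    return replicas == null
        ? new HashSet<String>()
        : new HashSet<String>(replicas);
  }
}
```

Later changes to the dataset are not visible through a copy that was returned earlier, which is what a background reader needs.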

Please continue to help review the code, and please correct me if I am wrong. Thanks.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
> ---------------------------------------------------------------------------------------
>
>                 Key: HDFS-14313
>                 URL: https://issues.apache.org/jira/browse/HDFS-14313
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, performance
>    Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>            Reporter: Lisheng Sun
>            Assignee: Lisheng Sun
>            Priority: Major
>         Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, HDFS-14313.002.patch, HDFS-14313.003.patch
>
>
> Both existing ways of getting used space, DU and DF, are insufficient.
>  #  Running DU across lots of disks is very expensive, and running all of the processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk is shared by multiple datanodes or other servers.
>  Getting HDFS used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory has very small overhead and is accurate.
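
The idea in the description can be sketched as follows. This is only an illustration under stated assumptions: UsedSpaceSketch, its nested Replica record, and the dataBytes/metaBytes fields are hypothetical and do not reflect the actual ReplicaInfo API or the attached patches. It shows that used space can be obtained by summing sizes already held in memory, without any disk I/O.

```java
/** Hypothetical sketch: compute used space from in-memory replica records. */
class UsedSpaceSketch {

  /** Hypothetical replica record: block data length plus metadata file length. */
  static final class Replica {
    final long dataBytes;
    final long metaBytes;

    Replica(long dataBytes, long metaBytes) {
      this.dataBytes = dataBytes;
      this.metaBytes = metaBytes;
    }
  }

  /** Sums the sizes of all given replicas; touches no files and runs no du/df. */
  static long usedBytes(Iterable<Replica> replicas) {
    long total = 0;
    for (Replica r : replicas) {
      total += r.dataBytes + r.metaBytes;
    }
    return total;
  }
}
```

The cost is one pass over the replicas of a block pool, so it grows with the number of replicas rather than with the amount of data on disk.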



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
