hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Arpit Agarwal (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11047) Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode
Date Thu, 27 Oct 2016 20:14:58 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15613075#comment-15613075
] 

Arpit Agarwal commented on HDFS-11047:
--------------------------------------

+1 for the v002 patch pending Jenkins. Thanks [~xiaobingo].

> Remove deep copies of FinalizedReplica to alleviate heap consumption on DataNode
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-11047
>                 URL: https://issues.apache.org/jira/browse/HDFS-11047
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, fs
>            Reporter: Xiaobing Zhou
>            Assignee: Xiaobing Zhou
>         Attachments: HDFS-11047.000.patch, HDFS-11047.001.patch, HDFS-11047.002.patch
>
>
> DirectoryScanner does scan by deep copying FinalizedReplica. In a deployment with 500,000+
blocks, we've seen the DN heap usage being accumulated to high peaks very quickly. Deep copies
of FinalizedReplica will make DN heap usage even worse if directory scans are scheduled more
frequently. This proposes removing unnecessary deep copies since DirectoryScanner#scan already
holds lock of dataset.
> DirectoryScanner#scan
> {code}
>     try(AutoCloseableLock lock = dataset.acquireDatasetLock()) {
>       for (Entry<String, ScanInfo[]> entry : diskReport.entrySet()) {
>         String bpid = entry.getKey();
>         ScanInfo[] blockpoolReport = entry.getValue();
>         
>         Stats statsRecord = new Stats(bpid);
>         stats.put(bpid, statsRecord);
>         LinkedList<ScanInfo> diffRecord = new LinkedList<ScanInfo>();
>         diffs.put(bpid, diffRecord);
>         
>         statsRecord.totalBlocks = blockpoolReport.length;
>         List<ReplicaInfo> bl = dataset.getFinalizedBlocks(bpid); /* deep copies
here*/
> {code}
> FsDatasetImpl#getFinalizedBlocks
> {code}
>   public List<ReplicaInfo> getFinalizedBlocks(String bpid) {
>     try (AutoCloseableLock lock = datasetLock.acquire()) {
>       ArrayList<ReplicaInfo> finalized =
>           new ArrayList<ReplicaInfo>(volumeMap.size(bpid));
>       for (ReplicaInfo b : volumeMap.replicas(bpid)) {
>         if (b.getState() == ReplicaState.FINALIZED) {
>           finalized.add(new ReplicaBuilder(ReplicaState.FINALIZED)
>               .from(b).build()); /* deep copies here*/
>         }
>       }
>       return finalized;
>     }
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org


Mime
View raw message