hadoop-hdfs-issues mailing list archives

From "Wei-Chiu Chuang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11661) GetContentSummary uses excessive amounts of memory
Date Sat, 22 Apr 2017 00:18:04 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15979635#comment-15979635 ]

Wei-Chiu Chuang commented on HDFS-11661:
----------------------------------------

Hey Sean, thanks for the comment.
I think there's a lightweight way to fix the same bug without the includedNodes set. I will
post a patch next week if I can come up with something clever.

> GetContentSummary uses excessive amounts of memory
> --------------------------------------------------
>
>                 Key: HDFS-11661
>                 URL: https://issues.apache.org/jira/browse/HDFS-11661
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.8.0, 3.0.0-alpha2
>            Reporter: Nathan Roberts
>            Priority: Blocker
>         Attachments: Heap growth.png
>
>
> ContentSummaryComputationContext::nodeIncluded() is being used to keep track of all INodes
> visited during the current content summary calculation. This can be all of the INodes in the
> filesystem, making for a VERY large hash table. This simply won't work on large filesystems.
>
> We noticed this after upgrading, when a namenode with ~100 million filesystem objects was
> spending significantly more time in GC. Fortunately this system had some memory breathing
> room; other clusters we have will not run with this additional demand on memory.
>
> This was added as part of HDFS-10797 as a way of keeping track of INodes that have already
> been accounted for - to avoid double counting.
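
For illustration only, here is a minimal sketch of the pattern the description refers to: a per-computation set that remembers every node visited so that a node reachable by two paths is counted once. This is not the Hadoop code. The Node and Context classes are hypothetical stand-ins, and the behavior of nodeIncluded() shown here (return true if the node was seen before, otherwise record it) is an assumption based on the description above. The point it tries to show is that the set ends up with one entry per distinct node visited, which is where the memory goes.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class VisitedSetSketch {
    /** Hypothetical stand-in for an inode; not the HDFS INode class. */
    static final class Node {
        final long id;
        final List<Node> children = new ArrayList<>();
        Node(long id) { this.id = id; }
    }

    /** Hypothetical stand-in for the content summary computation context. */
    static final class Context {
        // One entry per distinct node visited: grows with the size of the subtree.
        final Set<Long> includedNodes = new HashSet<>();
        long count = 0;

        /** Assumed semantics: true if already counted, otherwise record the node. */
        boolean nodeIncluded(Node n) {
            return !includedNodes.add(n.id);
        }
    }

    /** Counts each distinct node under n exactly once. */
    static void summarize(Node n, Context ctx) {
        if (ctx.nodeIncluded(n)) {
            return; // already counted via another path
        }
        ctx.count++;
        for (Node child : n.children) {
            summarize(child, ctx);
        }
    }

    public static void main(String[] args) {
        Node root = new Node(1);
        Node a = new Node(2);
        Node b = new Node(3);
        Node shared = new Node(4);
        root.children.add(a);
        root.children.add(b);
        a.children.add(shared);
        b.children.add(shared); // reachable through two parents

        Context ctx = new Context();
        summarize(root, ctx);
        System.out.println("counted=" + ctx.count
            + " setSize=" + ctx.includedNodes.size());
    }
}
```

In this sketch the shared node is counted once, but the set still holds an entry for every node in the tree, including those that could never be double counted. On a namespace with ~100 million objects, that means on the order of 100 million set entries for a single summary of the root.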



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org
