hadoop-hdfs-issues mailing list archives

From "Suresh Srinivas (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1110) Namenode heap optimization - reuse objects for commonly used file names
Date Mon, 10 May 2010 05:17:54 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12865661#action_12865661 ]

Suresh Srinivas commented on HDFS-1110:
---------------------------------------

The 470MB figure was based on my rough calculation; see the detailed info I posted earlier.
If we use a dictionary for names used more than 10 times, the savings are 1.6G in an old
generation of size 37G (BTW, not all of the 37G was used in the old gen: ~30G was used and
the remainder was headroom).
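
For readers following the thread, here is a minimal sketch of what such a dictionary could
look like. It is not the code in hdfs-1110.patch: the class name NameDictionary, the method
intern() and the useThreshold parameter are all hypothetical, and the sketch assumes that
file-name byte arrays are never modified after they are handed to the dictionary.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a name dictionary; not taken from the HDFS-1110 patch. */
class NameDictionary {
    // A name is promoted to the shared set once it has been seen more than this many times.
    private final int useThreshold;
    // Names seen so far but not yet promoted, with their use counts.
    private final Map<ByteBuffer, Integer> candidates = new HashMap<>();
    // Promoted names, mapped to the single byte[] that later callers will share.
    private final Map<ByteBuffer, byte[]> shared = new HashMap<>();

    NameDictionary(int useThreshold) {
        this.useThreshold = useThreshold;
    }

    /**
     * Returns the shared byte[] for this name if it has been promoted,
     * otherwise records one more use and returns the caller's own array.
     */
    byte[] intern(byte[] name) {
        // ByteBuffer compares and hashes by content, so it can serve as a map key.
        // Assumption: the wrapped array is never mutated afterwards.
        ByteBuffer key = ByteBuffer.wrap(name);
        byte[] canonical = shared.get(key);
        if (canonical != null) {
            return canonical;
        }
        int uses = candidates.merge(key, 1, Integer::sum);
        if (uses > useThreshold) {
            // This occurrence becomes the canonical copy for all later ones.
            candidates.remove(key);
            shared.put(key, name);
        }
        return name;
    }

    /** Number of names currently held in the shared set. */
    int size() {
        return shared.size();
    }
}
```

With a threshold of 10, the 11th occurrence of a name such as "part-00000" becomes the
shared copy and every later occurrence reuses it. Occurrences seen before promotion keep
their own arrays in this sketch, so it illustrates the idea only and says nothing about
the actual savings quoted above.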

> Namenode heap optimization - reuse objects for commonly used file names
> -----------------------------------------------------------------------
>
>                 Key: HDFS-1110
>                 URL: https://issues.apache.org/jira/browse/HDFS-1110
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Suresh Srinivas
>            Assignee: Suresh Srinivas
>             Fix For: 0.22.0
>
>         Attachments: hdfs-1110.2.patch, hdfs-1110.patch
>
>
> There are a lot of common file names used in HDFS, mainly created by mapreduce, such
> as file names starting with "part". Reusing byte[] corresponding to these recurring file names
> will save significant heap space used for storing the file names in millions of INodeFile
> objects.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

