hadoop-hdfs-issues mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6658) Namenode memory optimization - Block replicas list
Date Sat, 18 Oct 2014 01:36:35 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14175788#comment-14175788 ]

Konstantin Shvachko commented on HDFS-6658:
-------------------------------------------

Pretty neat data structure, Amir. Could be an improvement to the current structure, introduced
way back in HADOOP-1687.
With BitSet you will need about 12K of contiguous space in RAM for every 100,000-block report
(100,000 bits is roughly 12.5 KB). Sounds reasonable.
The only concern is that removing a large number of files, which is typically done when the NN gets
close to its capacity, does not free the memory used by the removed replicas. It can be reused
for new references, but not for anything else, unless some type of garbage collector is introduced.
Would be interesting to see how it behaves on a cluster over time.
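
To make the reuse-vs-release point concrete, here is a minimal sketch in Java, assuming a
hypothetical ReplicaSlotPool backed by a primitive array with a free list (an illustration,
not the structure in the patch): freeing replicas only returns their slots to the free list
for later allocations, while the backing array keeps its high-water-mark size.

    import java.util.ArrayDeque;
    import java.util.Deque;

    /**
     * Hypothetical slot pool: replica entries live in a primitive array and
     * freed slots go onto a free list. Removing entries makes their slots
     * reusable for new replicas, but the backing array itself never shrinks,
     * which is the reclamation concern above.
     */
    class ReplicaSlotPool {
      private int[] slots = new int[1024];        // packed replica data (illustrative)
      private int next = 0;                       // next never-used slot
      private final Deque<Integer> freeList = new ArrayDeque<>();

      /** Allocate a slot index, reusing a freed one if available. */
      int allocate(int value) {
        int idx = freeList.isEmpty() ? next++ : freeList.pop();
        if (idx == slots.length) {                // grow on demand, never shrink
          int[] bigger = new int[slots.length * 2];
          System.arraycopy(slots, 0, bigger, 0, slots.length);
          slots = bigger;
        }
        slots[idx] = value;
        return idx;
      }

      /** Free a slot: it becomes reusable, but no heap memory is returned. */
      void free(int idx) {
        freeList.push(idx);
      }

      /** Capacity stays at its high-water mark even after mass deletion. */
      int capacity() {
        return slots.length;
      }
    }

After deleting most files, capacity() would still report the peak size; only an explicit
compaction pass (the "garbage collector" mentioned above) could give that memory back.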

> Namenode memory optimization - Block replicas list 
> ---------------------------------------------------
>
>                 Key: HDFS-6658
>                 URL: https://issues.apache.org/jira/browse/HDFS-6658
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.4.1
>            Reporter: Amir Langer
>            Assignee: Amir Langer
>         Attachments: BlockListOptimizationComparison.xlsx, HDFS-6658.patch, Namenode Memory Optimizations - Block replicas list.docx
>
>
> Part of the memory consumed by every BlockInfo object in the Namenode is a linked list of block references for every DatanodeStorageInfo (called "triplets").
> We propose to change the way we store the list in memory. 
> Using primitive integer indexes instead of object references will reduce the memory needed for every block replica (when compressed oops is disabled), and in our new design the list overhead will be per DatanodeStorageInfo and not per block replica.
> See the attached design doc for details and evaluation results.
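
For illustration only, a minimal Java sketch of the index-linked idea summarized in the issue
description, using a hypothetical StorageBlockList per DatanodeStorageInfo (the attached design
doc describes the actual proposal): each replica costs a single 4-byte int slot instead of an
8-byte object reference (with compressed oops off), and the list bookkeeping is paid once per
storage rather than once per replica.

    /**
     * Hypothetical per-storage replica list linked by primitive int indexes:
     * next[i] holds the index of the block that follows block i on this
     * storage, so walking the list follows ints, not object pointers.
     */
    class StorageBlockList {
      private int[] next = new int[16]; // next[i] = index of block after i
      private int head = -1;            // index of the first block, -1 if empty

      /** Insert the block with the given index at the head of the list. */
      void add(int blockIndex) {
        if (blockIndex >= next.length) {
          int[] bigger = new int[Math.max(next.length * 2, blockIndex + 1)];
          System.arraycopy(next, 0, bigger, 0, next.length);
          next = bigger;
        }
        next[blockIndex] = head;
        head = blockIndex;
      }

      /** Count replicas by following int indexes instead of references. */
      int count() {
        int n = 0;
        for (int i = head; i != -1; i = next[i]) {
          n++;
        }
        return n;
      }
    }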



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
