hadoop-hdfs-issues mailing list archives

From "Daryn Sharp (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6658) Namenode memory optimization - Block replicas list
Date Thu, 05 Mar 2015 21:30:39 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14349489#comment-14349489 ]

Daryn Sharp commented on HDFS-6658:

No worries, discussion is good.  Will be even better if I can manage to get the patch up today/tomorrow.

I forgot to add that the #1 complication to removing all forms of back reference from storage
to block is full block reports.

No disagreement that we can do better than the current naive (scalability-wise) designs of
the balancer, decomm, and block reports.  My goal is an initial impl with minimal changes
to use the new data structures.

I did experiment with trying to scan the blocks map last fall.  I don't remember the exact slowdown,
but it was abysmal.  Even with a mere 60 million blocks, I gave up waiting for it to start after
20-30 mins.  I thought about incrementally cycling through the blocks, but I quickly realized
that the bookkeeping and consistency concerns would be a rabbit hole I could neither spend
time on, nor would anyone be likely to review in a timely fashion.

> Namenode memory optimization - Block replicas list 
> ---------------------------------------------------
>                 Key: HDFS-6658
>                 URL: https://issues.apache.org/jira/browse/HDFS-6658
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.4.1
>            Reporter: Amir Langer
>            Assignee: Daryn Sharp
>         Attachments: BlockListOptimizationComparison.xlsx, BlocksMap redesign.pdf, HDFS-6658.patch,
> Namenode Memory Optimizations - Block replicas list.docx
>
> Part of the memory consumed by every BlockInfo object in the Namenode is a linked list
> of block references for every DatanodeStorageInfo (called "triplets").
> We propose to change the way we store this list in memory.
> Using primitive integer indexes instead of object references will reduce the memory needed
> for every block replica (when compressed oops is disabled), and in the new design the list
> overhead will be per DatanodeStorageInfo, not per block replica.
> See the attached design doc for details and evaluation results.
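
The idea quoted above can be sketched roughly as follows. This is a minimal, hypothetical Java sketch, not the actual HDFS-6658 patch; the class and method names are invented for illustration. Each storage keeps a growable primitive int[] of indexes into a shared blocks array, so the per-replica cost is one int (4 bytes) rather than an object reference, and the array-growth overhead is paid once per DatanodeStorageInfo instead of once per block replica:

```java
import java.util.Arrays;

// Hypothetical sketch of a per-storage replica list backed by primitive ints.
// "blockIndex" values would index into a shared blocks array elsewhere in the
// Namenode; here the class only manages the index list itself.
class StorageReplicaList {
    private int[] blockIndexes = new int[4]; // small initial capacity
    private int size = 0;

    void add(int blockIndex) {
        if (size == blockIndexes.length) {
            // Amortized doubling: growth cost is per storage, not per replica.
            blockIndexes = Arrays.copyOf(blockIndexes, size * 2);
        }
        blockIndexes[size++] = blockIndex;
    }

    boolean remove(int blockIndex) {
        for (int i = 0; i < size; i++) {
            if (blockIndexes[i] == blockIndex) {
                // Swap with the last element; replica order is not preserved.
                blockIndexes[i] = blockIndexes[--size];
                return true;
            }
        }
        return false;
    }

    int size() { return size; }

    int get(int i) { return blockIndexes[i]; }
}
```

Note the trade-off this sketch makes explicit: removal is O(n) in the number of replicas on that storage, whereas the triplets linked list removes in O(1) given the block, which is presumably part of why full block reports complicate the design.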

This message was sent by Atlassian JIRA
