hadoop-hdfs-issues mailing list archives

From "Nathan Roberts (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6658) Namenode memory optimization - Block replicas list
Date Wed, 16 Jul 2014 21:07:08 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064095#comment-14064095 ]

Nathan Roberts commented on HDFS-6658:

I guess my argument is that (in the short or medium term) we don't actually need to reduce
the amount of RAM the NameNode uses. I've seen machines with 300 GB of RAM, and sizes continue
to increase at a steady clip every year. We do need to reduce the amount of Java heap that
the NameNode uses, since otherwise we get 10 minute long GC pauses.
This is a pretty sizable improvement, though, so it seems well worth considering.
* One thing I'm concerned about is the increased RAM requirements that have been going on
in the NN. For example, moving from 0.23 releases to 2.x releases requires about 9% more RAM
(I'm assuming it's something similar when going from 1.x to 2.x). This is a pretty big deal
and can cause some folks to fail their upgrade if they were living close to the edge. In my
opinion we need to be very careful whenever we increase the RAM requirements of the NN. For
every increase there should be a corresponding optimization so the net increase stays as close
to 0 as possible. Otherwise, some upgrades will certainly fail. 
* I'm not totally convinced by the long-GC argument. It's true that a worst-case full GC will
be much longer. However, isn't it also the case that we should almost never be doing worst-case
full GCs? On a large and busy NN, we see a GC longer than 2 seconds maybe once every
couple of days. Usually the big outliers are the result of a very large application doing
something bad, in which case even if you solve the GC problem, something else is liable to
cause the NN to be unresponsive.
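
To make the upgrade concern in the first bullet concrete, here is a minimal sketch of the headroom arithmetic. Only the roughly 9% growth figure comes from the comment above; the class name, method names, and heap sizes are hypothetical and chosen purely for illustration.

```java
/** Hypothetical helper; not part of Hadoop. Sizes are in GB. */
final class UpgradeHeadroom {
    /** Projected memory need after growing the current footprint by growthFraction. */
    static double projectedGb(double currentGb, double growthFraction) {
        return currentGb * (1.0 + growthFraction);
    }

    /** True if the projected footprint still fits under the configured limit. */
    static boolean fits(double currentGb, double growthFraction, double limitGb) {
        return projectedGb(currentGb, growthFraction) <= limitGb;
    }

    public static void main(String[] args) {
        double growth = 0.09; // the ~9% increase mentioned above
        // Made-up example: a NN using 100 GB today, with two possible limits.
        System.out.println(fits(100.0, growth, 110.0)); // enough headroom
        System.out.println(fits(100.0, growth, 105.0)); // living close to the edge
    }
}
```

Under these assumed numbers, a NameNode using 100 GB would need about 109 GB after the upgrade, so a 105 GB limit would not be enough while a 110 GB limit would.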

> Namenode memory optimization - Block replicas list 
> ---------------------------------------------------
>                 Key: HDFS-6658
>                 URL: https://issues.apache.org/jira/browse/HDFS-6658
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.4.1
>            Reporter: Amir Langer
>            Assignee: Amir Langer
>         Attachments: Namenode Memory Optimizations - Block replicas list.docx
> Part of the memory consumed by every BlockInfo object in the Namenode is a linked list
> of block references for every DatanodeStorageInfo (called "triplets").
> We propose to change the way we store the list in memory.
> Using primitive integer indexes instead of object references will reduce the memory needed
> for every block replica (when compressed oops is disabled), and in our new design the list
> overhead will be per DatanodeStorageInfo and not per block replica.
> See the attached design doc for details and evaluation results.
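
As a rough illustration of the general idea in the description above (linking list entries through primitive int indexes held in arrays owned by the storage, rather than through object references held by each block), here is a minimal sketch. It is not the design from the attached document: the class name, fields, and methods are all hypothetical, and the real proposal may lay out its data quite differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical sketch only; not the HDFS-6658 design and not Hadoop code.
 * One instance would stand in for the per-storage list of block replicas.
 */
final class IntLinkedReplicaList {
    private static final int NIL = -1;  // marks "no slot"

    private int[] blockIds;  // slot -> (hypothetical) integer block index
    private int[] next;      // slot -> next slot in the list, or NIL
    private int[] prev;      // slot -> previous slot in the list, or NIL
    private int head = NIL;      // first live slot
    private int freeHead = NIL;  // first recycled slot (chained through next[])
    private int used = 0;        // number of slots ever handed out
    private int size = 0;        // number of live entries

    IntLinkedReplicaList(int initialCapacity) {
        int cap = Math.max(1, initialCapacity);
        blockIds = new int[cap];
        next = new int[cap];
        prev = new int[cap];
    }

    /** Inserts a block index at the head of the list and returns its slot. */
    int add(int blockIndex) {
        int slot;
        if (freeHead != NIL) {           // reuse a recycled slot first
            slot = freeHead;
            freeHead = next[slot];
        } else {
            if (used == blockIds.length) grow();
            slot = used++;
        }
        blockIds[slot] = blockIndex;
        prev[slot] = NIL;
        next[slot] = head;
        if (head != NIL) prev[head] = slot;
        head = slot;
        size++;
        return slot;
    }

    /** Unlinks a live slot in O(1). The caller must pass a slot returned by add(). */
    void remove(int slot) {
        int p = prev[slot];
        int n = next[slot];
        if (p != NIL) next[p] = n; else head = n;
        if (n != NIL) prev[n] = p;
        next[slot] = freeHead;           // push the slot onto the free chain
        freeHead = slot;
        size--;
    }

    /** Returns the block indexes in list order, head first. */
    List<Integer> toList() {
        List<Integer> out = new ArrayList<>();
        for (int s = head; s != NIL; s = next[s]) out.add(blockIds[s]);
        return out;
    }

    int size() { return size; }

    private void grow() {
        int cap = blockIds.length * 2;
        blockIds = Arrays.copyOf(blockIds, cap);
        next = Arrays.copyOf(next, cap);
        prev = Arrays.copyOf(prev, cap);
    }
}
```

In this sketch each link costs a 4-byte int regardless of JVM pointer width, and the array overhead is paid once per list rather than once per entry, which is the kind of saving the description refers to.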

This message was sent by Atlassian JIRA