hadoop-hdfs-issues mailing list archives

From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7836) BlockManager Scalability Improvements
Date Thu, 26 Feb 2015 19:42:17 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339007#comment-14339007 ]

Chris Nauroth commented on HDFS-7836:

bq. With regard to compatibility... even if the NN max heap size is unchanged, the NN will
not actually consume that memory until it needs it, right? So existing configuration should
work just fine. I guess the one exception might be if you set a super-huge minimum (not maximum)
JVM heap size.

Yes, that's a correct description of the behavior, but I typically deploy with {{-Xms}} and
{{-Xmx}} configured to the same value.  In my experience, this has been a more stable configuration
for long-running Java processes with a large heap like the NameNode.  It can prevent unpredictable
costly allocations later in the process lifetime, when they aren't expected.  Also, there
is no guarantee that there would be sufficient memory to satisfy that allocation later in
the process lifetime, which would result in the process terminating.  You could also run afoul
of the OOM killer after weeks of a seemingly stable run.  The JVM generally does not free the memory it
allocates when the heap grows, so you end up needing to plan for the worst case anyway.  I
prefer the fail-fast behavior of trying to take all the memory at JVM process startup.  Of
course, this circumvents any opportunity to run with a smaller memory footprint than the max,
but I find that's generally the right trade-off for servers where it's typical to run only
a very well-defined set of processes on the box and plan strict resource allocations for each.
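
A minimal sketch of how one might check whether a running JVM was started with {{-Xms}} equal to {{-Xmx}}. It reads the startup flags through the standard {{RuntimeMXBean.getInputArguments()}} call. The class name {{HeapFlagCheck}} and its helper methods are hypothetical, only the k/m/g suffixes are handled, and the assumption that a later flag overrides an earlier one is a simplification. Heap sizes set through other options (for example {{-XX:InitialHeapSize}}) are not detected.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper, not part of Hadoop: reports whether -Xms equals -Xmx.
final class HeapFlagCheck {

    /** Parses a JVM size such as "512m" or "4g" into bytes; a bare number is bytes. */
    static long parseSize(String s) {
        char unit = Character.toLowerCase(s.charAt(s.length() - 1));
        long multiplier;
        switch (unit) {
            case 'k': multiplier = 1L << 10; break;
            case 'm': multiplier = 1L << 20; break;
            case 'g': multiplier = 1L << 30; break;
            default:  return Long.parseLong(s); // no suffix: plain bytes
        }
        return Long.parseLong(s.substring(0, s.length() - 1)) * multiplier;
    }

    /** Returns the size given by the last flag with this prefix, or -1 if absent. */
    static long flagValue(List<String> args, String prefix) {
        long value = -1;
        for (String arg : args) {
            if (arg.startsWith(prefix) && arg.length() > prefix.length()) {
                // Assumption: a later occurrence overrides an earlier one.
                value = parseSize(arg.substring(prefix.length()));
            }
        }
        return value;
    }

    /** True when both flags are present and specify the same number of bytes. */
    static boolean heapIsPinned(List<String> args) {
        long xms = flagValue(args, "-Xms");
        long xmx = flagValue(args, "-Xmx");
        return xms > 0 && xms == xmx;
    }

    public static void main(String[] unused) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("heap pinned (-Xms == -Xmx): " + heapIsPinned(jvmArgs));
    }
}
```

Running it as {{java -Xms4g -Xmx4g HeapFlagCheck.java}} (Java 11 or later, for single-file launch) should report a pinned heap. Omitting either flag should report the opposite.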

bq. There are a few big users who can shave several minutes off of their NN startup time by
setting the NN minimum heap size to something large. This prevents successive rounds of stop-the-world
GC where we copy everything to a new, 2x as large heap.

That's another example of why setting {{-Xms}} equal to {{-Xmx}} can be helpful.
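
To put a rough number on the resize rounds described in the quote, here is a small sketch that counts how many doublings separate an initial heap size from the size the NameNode eventually needs. It takes the "2x as large" growth from the quoted comment at face value, which is a simplification of real JVM heap sizing. The class name {{HeapGrowth}} and the example sizes are made up for illustration.

```java
// Hypothetical illustration: counts heap doublings under a strict 2x growth model.
final class HeapGrowth {

    /** Number of 2x resize rounds needed to grow from initialMb to at least neededMb. */
    static int doublingRounds(long initialMb, long neededMb) {
        if (initialMb <= 0) {
            throw new IllegalArgumentException("initial heap size must be positive");
        }
        int rounds = 0;
        long size = initialMb;
        while (size < neededMb) {
            size *= 2;   // assumed growth factor, taken from the quoted comment
            rounds++;
        }
        return rounds;
    }

    public static void main(String[] args) {
        // Example values only: a 1 GB initial heap growing to a 64 GB working set.
        System.out.println(doublingRounds(1024, 64 * 1024)); // prints 6
        // With the initial size already equal to the needed size, no resize happens.
        System.out.println(doublingRounds(64 * 1024, 64 * 1024)); // prints 0
    }
}
```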

These are the kinds of configurations where I have a potential compatibility concern.  If
someone is running multiple daemons on the box, and they have carefully carved out exact heap
allocations by setting {{-Xms}} equal to {{-Xmx}} on each process, then moving data off-heap
leaves part of their eagerly allocated heap unused, and the added off-heap memory could push total usage past available RAM.
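
The concern can be sketched as simple budget arithmetic. The example assumes a box where every daemon pins its heap with {{-Xms}} equal to {{-Xmx}}, and then adds an off-heap allocation on top without shrinking any heap. The class name {{MemoryBudget}} and all sizes are hypothetical. The sketch also ignores the OS and non-heap JVM overhead, so a real budget would be tighter.

```java
// Hypothetical illustration: does a set of pinned heaps plus off-heap memory fit in RAM?
final class MemoryBudget {

    /** Sum of all pinned heap sizes plus the off-heap allocation, in MB. */
    static long totalMb(long[] pinnedHeapsMb, long offHeapMb) {
        long total = offHeapMb;
        for (long heapMb : pinnedHeapsMb) {
            total += heapMb;
        }
        return total;
    }

    /** True when the combined allocation does not exceed physical RAM. */
    static boolean fits(long ramMb, long[] pinnedHeapsMb, long offHeapMb) {
        return totalMb(pinnedHeapsMb, offHeapMb) <= ramMb;
    }

    public static void main(String[] args) {
        long ramMb = 64 * 1024;                               // example: 64 GB box
        long[] heapsMb = {48 * 1024, 8 * 1024, 4 * 1024};     // example: three pinned daemons

        // Before the change: 60 GB of pinned heap fits in 64 GB.
        System.out.println(fits(ramMb, heapsMb, 0));          // prints true

        // After moving 20 GB of data off-heap with unchanged heap flags: 80 GB does not fit.
        System.out.println(fits(ramMb, heapsMb, 20 * 1024));  // prints false
    }
}
```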

bq. As a practical matter, we have found that everyone who has a big NN heap has a big NN
machine, which has much more memory than we can even use currently. So I would not expect
this to be a problem in practice.

I'll dig into this more on my side too and try to get more details on whether or not this
really could be a likely problem in practice.  I'd be curious to hear from others in the community
too.  (There are a lot of ways to run operations for a Hadoop cluster, sometimes with differing
opinions on configuration best practices.)

> BlockManager Scalability Improvements
> -------------------------------------
>                 Key: HDFS-7836
>                 URL: https://issues.apache.org/jira/browse/HDFS-7836
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Charles Lamb
>            Assignee: Charles Lamb
>         Attachments: BlockManagerScalabilityImprovementsDesign.pdf
> Improvements to BlockManager scalability.

This message was sent by Atlassian JIRA
