hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7836) BlockManager Scalability Improvements
Date Thu, 26 Feb 2015 19:05:05 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338926#comment-14338926 ]

Colin Patrick McCabe commented on HDFS-7836:

Thanks for taking a look at this, [~cnauroth] and [~arpitagarwal].

bq. Chris wrote: Do you intend to enforce an upper limit on growth of the off-heap allocation?
If so, do you see this as a new configuration property or as a function of an existing parameter
(i.e. equal to max heap with the consideration that the block map takes ~50% of the heap now)?
-Xmx alone will no longer be sufficient to define a ceiling for RAM utilization by the NameNode
process. This can be important in deployments that choose to co-locate other Hadoop daemons
on the same hosts as the NameNode.

That's an interesting question.  It is certainly possible to put an upper limit on the growth
of the total process memory (Java heap plus off-heap) by using {{ulimit -m}} on any UNIX-like
OS.  I'm not sure I would recommend that configuration, though, since when the limit is exceeded
the process is terminated.  I think it would be better in most cases to have the management
software running on the cluster examine memory usage, and warn when memory is getting low.
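The kind of check such management software might perform can be sketched in plain Java; this is a hypothetical monitor, not anything that exists in Hadoop, and the 90% threshold is purely illustrative:

```java
// Hypothetical sketch: how management software might watch a JVM's heap and
// warn before memory runs low, instead of relying on a hard ulimit that
// kills the process when exceeded. Threshold and output are illustrative.
public class HeapWatcher {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();                      // the -Xmx ceiling
        long used = rt.totalMemory() - rt.freeMemory(); // currently occupied heap
        double pct = 100.0 * used / max;
        System.out.printf("heap used: %d of %d bytes (%.1f%%)%n", used, max, pct);
        if (pct > 90.0) {
            System.err.println("WARNING: heap nearly exhausted; consider raising -Xmx");
        }
    }
}
```

A real monitor would poll periodically (or use {{MemoryMXBean}} notifications) rather than sample once, but the arithmetic is the same.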

bq. Can I take this to mean that there will be no new native code written as part of this
project? Of course, we can always do a native code implementation later if use of a private
Sun API becomes problematic, but I wanted to understand the code footprint for the current
proposal. Avoiding native code entirely would be nice, because it reduces the scope of testing
efforts across multiple platforms.

Yeah, we are hoping to avoid writing any JNI code here.  So far, it looks good on that front.
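The "private Sun API" under discussion is presumably {{sun.misc.Unsafe}}. A minimal sketch of what off-heap allocation through it looks like, with no JNI involved, follows; the record layout and field names are made up for illustration:

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Illustrative sketch only: storing a toy block-map-style record off the
// Java heap via sun.misc.Unsafe, the kind of private API referred to above.
public class OffHeapSketch {
    public static void main(String[] args) throws Exception {
        // Unsafe has no public constructor; grab the singleton reflectively.
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe) f.get(null);

        long size = 16;                          // bytes for one toy record
        long addr = unsafe.allocateMemory(size); // memory invisible to the GC
        unsafe.putLong(addr, 0x1234L);           // hypothetical block id field
        unsafe.putLong(addr + 8, 3L);            // hypothetical replica count
        System.out.println(unsafe.getLong(addr)); // prints 4660 (0x1234)
        unsafe.freeMemory(addr);                 // manual lifetime management
    }
}
```

Note the trade-off this implies: memory obtained this way is not tracked by the collector, so it must be freed explicitly, which is exactly why -Xmx alone stops being a full accounting of the process footprint.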

bq. Are you proposing that off-heaping is an opt-in feature that must be explicitly enabled
in configuration, or are you proposing that off-heaping will be the new default behavior?
Arguably, jumping to off-heaping as the default could be seen as a backwards-incompatibility,
because it might be unsafe to deploy the feature without simultaneously down-tuning the NameNode
max heap size. Some might see that as backwards-incompatible with existing configurations.

I think off-heaping should be on by default.  HDFS gets enough bad press from having short-circuit
and other optimizations turned off by default... we should be a little nicer this time :)

With regard to compatibility... even if the NN max heap size is unchanged, the NN will not
actually consume that memory until it needs it, right?  So existing configuration should work
just fine.  I guess the one exception might be if you set a super-huge *minimum* (not maximum)
JVM heap size.  But even in that case, I would expect things to work, since virtual memory
would pick up the slack.  Unless you have turned off swapping, but that's not recommended.
 As a practical matter, we have found that everyone who has a big NN heap has a big NN machine,
which has much more memory than we can even use currently.  So I would not expect this to
be a problem in practice.

bq. Arpit asked: Do you have any estimates for startup time overhead due to GCs?

There are a few big users who can shave several minutes off their NN startup time by setting
the NN minimum heap size to something large.  This prevents successive rounds of stop-the-world
GC in which everything is copied into a new heap twice the size of the old one.  We will get
some before-and-after startup numbers later that should illustrate this even more clearly.
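Concretely, that tuning amounts to pinning the minimum heap to the maximum, e.g. in hadoop-env.sh; the 64g figure below is a placeholder, not a recommendation:

```shell
# Illustrative hadoop-env.sh fragment: pre-size the NameNode heap so startup
# never triggers successive stop-the-world copies into a progressively larger
# heap. -Xms (minimum) is set equal to -Xmx (maximum).
export HADOOP_NAMENODE_OPTS="-Xms64g -Xmx64g ${HADOOP_NAMENODE_OPTS}"
```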

> BlockManager Scalability Improvements
> -------------------------------------
>                 Key: HDFS-7836
>                 URL: https://issues.apache.org/jira/browse/HDFS-7836
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Charles Lamb
>            Assignee: Charles Lamb
>         Attachments: BlockManagerScalabilityImprovementsDesign.pdf
> Improvements to BlockManager scalability.
