hadoop-mapreduce-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Gera Shegalov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5785) Derive heap size or mapreduce.*.memory.mb automatically
Date Thu, 15 Jan 2015 08:30:36 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278420#comment-14278420

Gera Shegalov commented on MAPREDUCE-5785:

[~kasha], thanks for working on this patch. It's look good. I've got a few suggestions:
1. {{TaskAttemptImpl#getMemoryRequired}}:
It can be written as a one-liner as follows:
  private int getMemoryRequired(JobConf conf, TaskType taskType) {
    return conf.getMemoryRequired(TypeConverter.fromYarn(taskType));

2. {{TestMapReduceChildJVM#testAutoHeapSize}}
We don't need to create 2 objects via {{new JobConf(new Configuration())}}, why not simply
   JobConf conf = new JobConf();

3. {{JobConf#getConfiguredTaskJavaOpts}}
Instead of doing {{String#equals("")}} let us use {{String#isEmpty}}
Checking {{user/adminClasspath}} for null is redundant because they are actually known to
be not null. I also wonder whether preventing a single extra space in the command line is
worth 7 lines of code :)

4. {{mapred-default.xml}}
Unsetting default values is inconsistent within the patch and outside the patch. For {{mapred.child.java.opts}}
we use an empty <value> like for many other parameters, but for {{mapreduce.*.memory.mb}}
the value element is removed by commenting it out. I think it should be done the same way
as {{mapred.child.java.opts}} , and the comment explaining the defaults is enough.

> Derive heap size or mapreduce.*.memory.mb automatically
> -------------------------------------------------------
>                 Key: MAPREDUCE-5785
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5785
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mr-am, task
>            Reporter: Gera Shegalov
>            Assignee: Karthik Kambatla
>             Fix For: 3.0.0
>         Attachments: MAPREDUCE-5785.v01.patch, MAPREDUCE-5785.v02.patch, MAPREDUCE-5785.v03.patch,
mr-5785-4.patch, mr-5785-5.patch, mr-5785-6.patch, mr-5785-7.patch
> Currently users have to set 2 memory-related configs per Job / per task type.  One first
chooses some container size map reduce.\*.memory.mb and then a corresponding maximum Java
heap size Xmx < map reduce.\*.memory.mb. This makes sure that the JVM's C-heap (native
memory + Java heap) does not exceed this mapreduce.*.memory.mb. If one forgets to tune Xmx,
MR-AM might be 
> - allocating big containers whereas the JVM will only use the default -Xmx200m.
> - allocating small containers that will OOM because Xmx is too high.
> With this JIRA, we propose to set Xmx automatically based on an empirical ratio that
can be adjusted. Xmx is not changed automatically if provided by the user.

This message was sent by Atlassian JIRA

View raw message