hadoop-mapreduce-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Gera Shegalov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5785) Derive heap size or mapreduce.*.memory.mb automatically
Date Thu, 15 Jan 2015 08:30:36 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278420#comment-14278420
] 

Gera Shegalov commented on MAPREDUCE-5785:
------------------------------------------

[~kasha], thanks for working on this patch. It's look good. I've got a few suggestions:
 
1. {{TaskAttemptImpl#getMemoryRequired}}:
It can be written as a one-liner as follows:
{code}
  private int getMemoryRequired(JobConf conf, TaskType taskType) {
    return conf.getMemoryRequired(TypeConverter.fromYarn(taskType));
  }
{code}

2. {{TestMapReduceChildJVM#testAutoHeapSize}}
We don't need to create 2 objects via {{new JobConf(new Configuration())}}, why not simply
have
{code}
   JobConf conf = new JobConf();
{code}

3. {{JobConf#getConfiguredTaskJavaOpts}}
Instead of doing {{String#equals("")}} let us use {{String#isEmpty}}
Checking {{user/adminClasspath}} for null is redundant because they are actually known to
be not null. I also wonder whether preventing a single extra space in the command line is
worth 7 lines of code :)

4. {{mapred-default.xml}}
Unsetting default values is inconsistent within the patch and outside the patch. For {{mapred.child.java.opts}}
we use an empty <value> like for many other parameters, but for {{mapreduce.*.memory.mb}}
the value element is removed by commenting it out. I think it should be done the same way
as {{mapred.child.java.opts}} , and the comment explaining the defaults is enough.

> Derive heap size or mapreduce.*.memory.mb automatically
> -------------------------------------------------------
>
>                 Key: MAPREDUCE-5785
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5785
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mr-am, task
>            Reporter: Gera Shegalov
>            Assignee: Karthik Kambatla
>             Fix For: 3.0.0
>
>         Attachments: MAPREDUCE-5785.v01.patch, MAPREDUCE-5785.v02.patch, MAPREDUCE-5785.v03.patch,
mr-5785-4.patch, mr-5785-5.patch, mr-5785-6.patch, mr-5785-7.patch
>
>
> Currently users have to set 2 memory-related configs per Job / per task type.  One first
chooses some container size map reduce.\*.memory.mb and then a corresponding maximum Java
heap size Xmx < map reduce.\*.memory.mb. This makes sure that the JVM's C-heap (native
memory + Java heap) does not exceed this mapreduce.*.memory.mb. If one forgets to tune Xmx,
MR-AM might be 
> - allocating big containers whereas the JVM will only use the default -Xmx200m.
> - allocating small containers that will OOM because Xmx is too high.
> With this JIRA, we propose to set Xmx automatically based on an empirical ratio that
can be adjusted. Xmx is not changed automatically if provided by the user.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message