hadoop-mapreduce-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5785) Derive task attempt JVM max heap size and io.sort.mb automatically from mapreduce.*.memory.mb
Date Fri, 04 Apr 2014 17:00:35 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960126#comment-13960126 ]

Jason Lowe commented on MAPREDUCE-5785:

Briefly looked at the new patch, a few comments:

* I thought MAPREDUCE-5028 solved the integer sign overflow issues, so do we still need to
limit it to 1024 instead of 2047?
* Nit: IO_SORT_MB_PERCENTAGE is actually treated as a ratio rather than a percentage, otherwise
users may try to set it to something like 50 rather than 0.5 based on the description.  It
also doesn't say what it's relative to, so maybe IO_SORT_HEAP_RATIO/mapreduce.task.io.sort.heap.ratio
or IO_SORT_MB_HEAP_RATIO/mapreduce.task.io.sort.mb.heap.ratio would be more clear and consistent
with the other property?
* mapred-default description of mapreduce.task.io.sort.mb has "perecentage"
* Technically this may need to be marked as an incompatible change, as jobs that were setting
an explicit large heap due to their particular code needs but not setting any value for io.sort.mb
may now fail with OOM errors since they will implicitly get a smaller heap due to an automatically
enlarged io.sort.mb.
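
To make the ratio and compatibility points concrete, here is a rough sketch of the arithmetic, not the code in the patch. The function names, the 4096 MB heap, the 0.5 ratio, and the 100 MB value for an untuned io.sort.mb are assumptions chosen for illustration; the 1024 cap is the limit questioned in the first comment.

```python
def derive_io_sort_mb(heap_mb, heap_ratio, cap_mb=1024):
    """Derive a sort buffer size from the heap size.

    heap_ratio is a fraction of the heap (0.5 means half), not a
    percentage, so a value such as 50 is rejected instead of being
    silently misread. All names here are hypothetical.
    """
    if not 0.0 < heap_ratio <= 1.0:
        raise ValueError("heap_ratio must be in (0, 1], got %r" % heap_ratio)
    return min(int(heap_mb * heap_ratio), cap_mb)


def heap_left_for_user_code(heap_mb, io_sort_mb):
    """Heap remaining once the sort buffer has been allocated."""
    return heap_mb - io_sort_mb


# A job that sets a large heap explicitly but never touches io.sort.mb.
heap_mb = 4096
untuned_sort_mb = 100  # assumed value when io.sort.mb is left alone
derived_sort_mb = derive_io_sort_mb(heap_mb, 0.5)

before = heap_left_for_user_code(heap_mb, untuned_sort_mb)
after = heap_left_for_user_code(heap_mb, derived_sort_mb)
print("heap left for user code: %d MB before, %d MB after" % (before, after))
```

Under these assumed numbers the job's own code loses roughly 900 MB of heap without any change to its configuration, which is the scenario behind the incompatible-change concern.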

> Derive task attempt JVM max heap size and io.sort.mb automatically from mapreduce.*.memory.mb
> ---------------------------------------------------------------------------------------------
>                 Key: MAPREDUCE-5785
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5785
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mr-am, task
>            Reporter: Gera Shegalov
>            Assignee: Gera Shegalov
>         Attachments: MAPREDUCE-5785.v01.patch, MAPREDUCE-5785.v02.patch
> Currently users have to set two memory-related configs per job / per task type.  One first
chooses some container size mapreduce.*.memory.mb and then a corresponding maximum Java
heap size Xmx < mapreduce.*.memory.mb. This makes sure that the JVM's C-heap (native
memory + Java heap) does not exceed this mapreduce.*.memory.mb. If one forgets to tune Xmx,
MR-AM might be
> - allocating big containers whereas the JVM will only use the default -Xmx200m.
> - allocating small containers that will OOM because Xmx is too high.
> With this JIRA, we propose to set Xmx automatically based on an empirical ratio that
can be adjusted. Xmx is not changed automatically if provided by the user.
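
The proposal in the description could be sketched roughly as follows. This is an illustration under stated assumptions, not the patch itself: the function name, the 0.8 heap ratio, and the regular expression used to detect a user-supplied -Xmx are all hypothetical.

```python
import re

# Matches a user-supplied maximum heap option such as -Xmx512m or -Xmx2g.
_XMX = re.compile(r"(?:^|\s)-Xmx\d+[kKmMgG]?(?=\s|$)")


def task_java_opts(container_mb, user_opts="", heap_ratio=0.8):
    """Return JVM options for a task attempt.

    If the user already set -Xmx, the options are returned unchanged.
    Otherwise -Xmx is derived from the container size so that the Java
    heap plus native memory stays within the container limit.
    heap_ratio is an assumed, adjustable value.
    """
    if _XMX.search(user_opts):
        return user_opts
    xmx_mb = int(container_mb * heap_ratio)
    return ("%s -Xmx%dm" % (user_opts, xmx_mb)).strip()


print(task_java_opts(2048))                          # derived from container size
print(task_java_opts(2048, "-Xmx512m -verbose:gc"))  # user value kept as-is
```

Leaving an explicit -Xmx untouched matches the last sentence of the description; only jobs that never set it receive the derived value.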
