hadoop-common-issues mailing list archives

From "Joep Rottinghuis (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-14176) distcp reports beyond physical memory limits on 2.X
Date Mon, 13 Mar 2017 21:07:41 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-14176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15922942#comment-15922942 ]

Joep Rottinghuis commented on HADOOP-14176:
-------------------------------------------

IIUC the limits in distcp-default.xml are there to indicate that a copy task doesn't require much memory.
If your cluster is configured with a default mapper container size of 4GB, for example, that
would be a waste for distcp.

It seems to me that mapreduce.reduce.memory.mb is irrelevant, because distcp doesn't
use any reducers.
Depending on the number of mappers (and thus whether uberization is desired), yarn.app.mapreduce.am.resource.mb
might be applicable as well. It may make sense to set the AM memory to be the same as the
mapper memory, so that small copy jobs can uberize (map tasks run inside the AM container).

What is more relevant in this case is how the JVM args are set: if the heap defaults to a larger
number than what YARN gives the container, then YARN can kill the container.
One can set both of these params (with the heap roughly three-quarters of the container size):
{code}
-Dmapreduce.map.java.opts='-Xmx768m' -Dmapreduce.map.memory.mb=1024
{code}
So rather than removing the map memory setting and letting it default to the cluster config (which
could be much larger than the 1GB needed for distcp tasks), I think it may make sense to add
mapreduce.map.java.opts to distcp-default.xml.
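Concretely, that proposal would mean distcp-default.xml carries a heap setting consistent with the 1024 MB container it already requests. The snippet below is a sketch of that idea, not an existing entry in the file:

```xml
<!-- Hypothetical addition to distcp-default.xml: cap the map JVM heap so it
     fits inside the 1024 MB container that distcp already requests. -->
<property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx768m</value>
</property>
```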

> distcp reports beyond physical memory limits on 2.X
> ---------------------------------------------------
>
>                 Key: HADOOP-14176
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14176
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: tools/distcp
>    Affects Versions: 2.9.0
>            Reporter: Fei Hui
>            Assignee: Fei Hui
>         Attachments: HADOOP-14176-branch-2.001.patch
>
>
> When I run distcp, I get errors like the following:
> {quote}
> 17/02/21 15:31:18 INFO mapreduce.Job: Task Id : attempt_1487645941615_0037_m_000003_0, Status : FAILED
> Container [pid=24661,containerID=container_1487645941615_0037_01_000005] is running beyond physical memory limits. Current usage: 1.1 GB of 1 GB physical memory used; 4.0 GB of 5 GB virtual memory used. Killing container.
> Dump of the process-tree for container_1487645941615_0037_01_000005 :
>         |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
>         |- 24661 24659 24661 24661 (bash) 0 0 108650496 301 /bin/bash -c /usr/lib/jvm/java/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx2120m -Djava.io.tmpdir=/mnt/disk4/yarn/usercache/hadoop/appcache/application_1487645941615_0037/container_1487645941615_0037_01_000005/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/mnt/disk2/log/hadoop-yarn/containers/application_1487645941615_0037/container_1487645941615_0037_01_000005 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 192.168.1.208 44048 attempt_1487645941615_0037_m_000003_0 5 1>/mnt/disk2/log/hadoop-yarn/containers/application_1487645941615_0037/container_1487645941615_0037_01_000005/stdout 2>/mnt/disk2/log/hadoop-yarn/containers/application_1487645941615_0037/container_1487645941615_0037_01_000005/stderr
>         |- 24665 24661 24661 24661 (java) 1766 336 4235558912 280699 /usr/lib/jvm/java/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx2120m -Djava.io.tmpdir=/mnt/disk4/yarn/usercache/hadoop/appcache/application_1487645941615_0037/container_1487645941615_0037_01_000005/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/mnt/disk2/log/hadoop-yarn/containers/application_1487645941615_0037/container_1487645941615_0037_01_000005 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 192.168.1.208 44048 attempt_1487645941615_0037_m_000003_0 5
> Container killed on request. Exit code is 143
> Container exited with a non-zero exit code 143
> {quote}
> Digging into the code, I find this happens because the distcp configuration overrides mapred-site.xml:
> {code}
>     <property>
>         <name>mapred.job.map.memory.mb</name>
>         <value>1024</value>
>     </property>
>     <property>
>         <name>mapred.job.reduce.memory.mb</name>
>         <value>1024</value>
>     </property>
> {code}
> When mapreduce.map.java.opts and mapreduce.map.memory.mb are set in mapred-default.xml, and the values are larger than those set in distcp-default.xml, this error may occur.
> We should remove those two configurations from distcp-default.xml.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

