hadoop-common-user mailing list archives

From Ziming Dong <dzm1016397...@gmail.com>
Subject org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.IOException
Date Sat, 14 Jan 2017 06:25:39 GMT
The job succeeds on a small dataset but always fails on a large dataset (5
times the size of the small one). I'm using Hadoop 2.7.3. According to the
log, a mapper task running in a container (YarnChild) on the same machine as
the application master failed to read data from that same machine. Is this
caused by too much network traffic? My Hadoop configuration mainly follows
this post
<http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.6.0/bk_installing_manually_book/content/rpm-chap1-11.html>.
There are many
`WARN hdfs.DFSClient: Slow ReadProcessor read fields took 48668ms (threshold=30000ms);`
messages at the start of the job. Is that caused by the network or by files
that are too large? I also get connection refused errors, even though ufw is
disabled on my Ubuntu system.
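
To judge whether the slow reads are occasional or constant, the warnings can be counted and summarized from the client log. The following is a minimal Python sketch, assuming the log text contains messages in exactly the format quoted above. The function name `slow_read_summary` is hypothetical and not part of Hadoop.

```python
import re

# Matches the DFSClient warning quoted in the message. \s+ is used between
# tokens so the pattern still matches when the archive has wrapped the line.
SLOW_READ = re.compile(
    r"Slow\s+ReadProcessor\s+read\s+fields\s+took\s+(\d+)ms\s+"
    r"\(threshold=(\d+)ms\)"
)


def slow_read_summary(log_text):
    """Count the slow-read warnings and summarize how long they took (ms)."""
    took = [int(ms) for ms, _threshold in SLOW_READ.findall(log_text)]
    if not took:
        return {"count": 0, "max_ms": None, "mean_ms": None}
    return {
        "count": len(took),
        "max_ms": max(took),
        "mean_ms": sum(took) / len(took),
    }


# Sample input: the single warning quoted in the message.
sample = (
    "WARN hdfs.DFSClient: Slow ReadProcessor read fields took 48668ms\n"
    "(threshold=30000ms);"
)
print(slow_read_summary(sample))
```

Running this over the full client output, rather than the single sample line, would show whether the warnings are clustered at the start of the job or spread across it.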

gistfile1.txt
Application owner: hadoop
Container: container_1484235446677_0001_01_430973

stderr:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in
[jar:file:/home/hadoop/hadoop-install/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in
[jar:file:/hadoop-user-data/tmp/nm-local-dir/usercache/hadoop/appcache/application_1484235446677_0001/filecache/11/job.jar/job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

stdout: (empty)

syslog:
2017-01-13 14:05:52,857 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2017-01-13 14:05:52,942 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2017-01-13 14:05:52,942 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system started
2017-01-13 14:05:52,954 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with tokens:
2017-01-13 14:05:52,954 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1484235446677_0001, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@2ea41516)
2017-01-13 14:05:53,192 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now.
2017-01-13 14:05:58,647 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 500ms before retrying again. Got null now.
2017-01-13 14:06:03,316 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1000ms before retrying again. Got null now.
2017-01-13 14:06:10,565 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1500ms before retrying again. Got null now.
2017-01-13 14:06:13,710 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1500ms before retrying again. Got null now.
2017-01-13 14:06:20,712 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1500ms before retrying again. Got null now.
2017-01-13 14:06:23,423 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1500ms before retrying again. Got null now.
2017-01-13 14:06:30,548 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1500ms before retrying again. Got null now.
2017-01-13 14:06:33,442 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1500ms before retrying again. Got null now.
2017-01-13 14:06:40,291 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1500ms before retrying again. Got null now.
2017-01-13 14:07:06,243 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1500ms before retrying again. Got null now.
2017-01-13 14:07:21,126 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1500ms before retrying again. Got null now.
2017-01-13 14:07:23,748 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1500ms before retrying again. Got null now.
2017-01-13 14:07:30,918 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1500ms before retrying again. Got null now.
2017-01-13 14:07:43,323 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1500ms before retrying again. Got null now.
2017-01-13 14:08:02,122 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1500ms before retrying again. Got null now.
2017-01-13 14:08:07,564 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1500ms before retrying again. Got null now.
2017-01-13 14:08:19,747 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1500ms before retrying again. Got null now.
2017-01-13 14:08:29,139 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1500ms before retrying again. Got null now.
2017-01-13 14:08:31,592 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1500ms before retrying again. Got null now.
2017-01-13 14:08:34,248 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1500ms before retrying again. Got null now.
2017-01-13 14:08:43,560 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1500ms before retrying again. Got null now.
2017-01-13 14:08:46,417 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1500ms before retrying again. Got null now.
2017-01-13 14:08:53,251 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1500ms before retrying again. Got null now.
2017-01-13 14:09:06,076 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1500ms before retrying again. Got null now.
2017-01-13 14:09:29,695 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1500ms before retrying again. Got null now.
2017-01-13 14:09:40,285 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1500ms before retrying again. Got null now.
2017-01-13 14:09:50,133 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1500ms before retrying again. Got null now.
2017-01-13 14:09:59,649 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1500ms before retrying again. Got null now.
2017-01-13 14:10:09,124 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1500ms before retrying again. Got null now.
2017-01-13 14:10:24,671 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1500ms before retrying again. Got null now.
2017-01-13 14:10:44,371 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1500ms before retrying again. Got null now.
2017-01-13 14:11:12,536 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.IOException: Failed on local exception: java.io.IOException: Connection reset by peer; Host Details : local host is: "cp89/xxx.xx.xxx.89"; destination host is: "cp89":42085;
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:773)
    at org.apache.hadoop.ipc.Client.call(Client.java:1479)
    at org.apache.hadoop.ipc.Client.call(Client.java:1412)
    at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:242)
    at com.sun.proxy.$Proxy9.getTask(Unknown Source)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:132)
Caused by: java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
    at sun.nio.ch.IOUtil.read(IOUtil.java:197)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
    at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:57)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:520)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
    at java.io.DataInputStream.readInt(DataInputStream.java:387)
    at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1084)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:979)
2017-01-13 14:11:12,539 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping MapTask metrics system...
2017-01-13 14:11:12,540 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system stopped.
2017-01-13 14:11:12,541 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system shutdown complete.
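
The retry messages in this log are spaced several seconds apart even though the reported sleep is at most 1500ms. One way to quantify that is to compare each reported sleep with the wall-clock gap to the next retry message. The following is a rough Python sketch, assuming the syslog text is available as a string. The function name `retry_gaps` is hypothetical, and reading the extra time as time spent waiting on the `getTask` call is an assumption, not something the log states directly.

```python
import re
from datetime import datetime, timedelta

# Matches the YarnChild retry messages. \s+ is used between tokens so the
# pattern still matches text that the mail archive has re-wrapped.
RETRY = re.compile(
    r"(\d{4}-\d{2}-\d{2})\s+(\d{2}:\d{2}:\d{2},\d{3})\s+INFO\s+\[main\]\s+"
    r"org\.apache\.hadoop\.mapred\.YarnChild:\s+Sleeping\s+for\s+(\d+)ms"
)


def retry_gaps(log_text):
    """Return (sleep_ms, gap_ms) pairs, one per consecutive pair of retries.

    sleep_ms is the sleep reported in a retry message; gap_ms is the
    wall-clock time from that message to the next retry message.
    """
    events = [
        (datetime.strptime(f"{day} {clock}", "%Y-%m-%d %H:%M:%S,%f"), int(ms))
        for day, clock, ms in RETRY.findall(log_text)
    ]
    one_ms = timedelta(milliseconds=1)
    return [
        (sleep_ms, (next_ts - ts) // one_ms)
        for (ts, sleep_ms), (next_ts, _next_sleep) in zip(events, events[1:])
    ]


# Sample input: the first three retry messages from the log above.
sample = """\
2017-01-13 14:05:53,192 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now.
2017-01-13 14:05:58,647 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 500ms before retrying again. Got null now.
2017-01-13 14:06:03,316 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 1000ms before retrying again. Got null now.
"""
for sleep_ms, gap_ms in retry_gaps(sample):
    print(f"slept {sleep_ms}ms, next retry after {gap_ms}ms")
```

For the three sample lines the gaps are 5455ms and 4669ms against reported sleeps of 0ms and 500ms, so most of the time between retries is spent somewhere other than the sleep itself.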


-- 

Ziming Dong
https://about.me/ziming.dong
