hadoop-mapreduce-user mailing list archives

From "Phan, Truong Q" <Troung.P...@team.telstra.com>
Subject Hadoop 2.2.0-cdh5.0.0-beta-1 - MapReduce Streaming - Failed to run on a larger jobs
Date Thu, 10 Apr 2014 05:25:43 GMT
Hi,

My Hadoop 2.2.0-cdh5.0.0-beta-1 installation fails to run a larger MapReduce Streaming job.
The job runs without problems when the input is a single CSV file of around 400 MB.
However, it fails when I run it against 11 input files of about 400 MB each.
The job fails with the error below.

I would appreciate any hints or suggestions for fixing this issue.

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
2014-04-10 10:28:10,498 FATAL [IPC Server handler 2 on 52179] org.apache.hadoop.mapred.TaskAttemptListenerImpl:
Task: attempt_1395628276810_0062_m_000149_0 - exited : java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
subprocess failed with code 1
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)

2014-04-10 10:28:10,498 INFO [IPC Server handler 2 on 52179] org.apache.hadoop.mapred.TaskAttemptListenerImpl:
Diagnostics report from attempt_1395628276810_0062_m_000149_0: Error: java.lang.RuntimeException:
PipeMapRed.waitOutputThreads(): subprocess failed with code 1
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)

2014-04-10 10:28:10,499 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl:
Diagnostics report from attempt_1395628276810_0062_m_000149_0: Error: java.lang.RuntimeException:
PipeMapRed.waitOutputThreads(): subprocess failed with code 1
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
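
The "subprocess failed with code 1" message means the mapper process itself exited with status 1, so the Java stack trace does not show the underlying cause. One way to narrow this down is to make the mapper skip and report records it cannot parse instead of exiting. The following is only a minimal sketch under stated assumptions: it is not the actual device-mapper-v1.py (which is not shown in this message), and the names parse_record, run_mapper, the DeviceMapper/BadRecords counter and the three-field check are all hypothetical.

```python
import sys


def parse_record(line, min_fields=3):
    """Split one CSV line; raise ValueError if it has too few fields.

    min_fields=3 is an assumed value for illustration only.
    """
    fields = line.rstrip("\n").split(",")
    if len(fields) < min_fields:
        raise ValueError("expected >= %d fields, got %d" % (min_fields, len(fields)))
    return fields


def run_mapper(lines, out=sys.stdout, err=sys.stderr):
    """Emit key<TAB>value for good lines; count and log bad lines instead of crashing."""
    bad = 0
    for line_no, line in enumerate(lines, 1):
        try:
            fields = parse_record(line)
            out.write("%s\t%s\n" % (fields[0], ",".join(fields[1:])))
        except Exception as exc:
            bad += 1
            # Hadoop Streaming reads stderr lines of this form as counter updates.
            err.write("reporter:counter:DeviceMapper,BadRecords,1\n")
            # Any other stderr text ends up in the task's stderr log.
            err.write("bad record at line %d: %s\n" % (line_no, exc))
    return bad

# In a real mapper script the last line would be: run_mapper(sys.stdin)
```

With this shape, a malformed line shows up as a counter value and a line in the task's stderr log rather than as a failed task attempt.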

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
MAPREDUCE SCRIPT:
$ cat devices-hdfs-mr-PyIterGen-v3.sh
#!/bin/sh
export HADOOP_CMD=/usr/bin/hadoop
export HADOOP_HOME=/usr/lib/hadoop-0.20-mapreduce
export HADOOP_STREAMING=/usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming-2.2.0-mr1-cdh5.0.0-beta-1.jar

# Clean up the previous runs
sudo -u hdfs hadoop fs -rm -f -R /data/db/bdms1p/device

sudo -u hdfs hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-*streaming*.jar \
-D mapreduce.job.reduces=160 \
-files ./device-mapper-v1.py,./device-combiner-v1.py,./device-reducer-v1.py,./nem-dms-stats-parameter.txt,./map-debug.py \
-mapper ./device-mapper-v1.py \
-combiner ./device-combiner-v1.py \
-reducer ./device-reducer-v1.py \
-mapdebug ./map-debug.py \
-input /data/db/bdms1p/input/*.csv \
-output /data/db/bdms1p/device

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
OUTPUT ON THE CONSOLE:
$ ./devices-hdfs-mr-PyIterGen-v3.sh
14/04/10 10:26:27 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval
= 86400000 minutes, Emptier interval = 0 minutes.
Moved: 'hdfs://nsda3dmsrpt02.internal.bigpond.com:8020/data/db/bdms1p/device' to trash at:
hdfs://nsda3dmsrpt02.internal.bigpond.com:8020/user/hdfs/.Trash/Current
packageJobJar: [] [/usr/lib/hadoop-mapreduce/hadoop-streaming-2.2.0-cdh5.0.0-beta-1.jar] /tmp/streamjob781154149428893352.jar
tmpDir=null
14/04/10 10:26:30 INFO client.RMProxy: Connecting to ResourceManager at bpdevdmsdbs01/172.18.127.245:8032
14/04/10 10:26:30 INFO client.RMProxy: Connecting to ResourceManager at bpdevdmsdbs01/172.18.127.245:8032
14/04/10 10:26:32 INFO mapred.FileInputFormat: Total input paths to process : 106
14/04/10 10:26:32 INFO mapreduce.JobSubmitter: number of splits:317
14/04/10 10:26:32 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.cache.files.filesizes is deprecated.
Instead, use mapreduce.job.cache.files.filesizes
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.cache.files is deprecated. Instead,
use mapreduce.job.cache.files
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead,
use mapreduce.job.reduces
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.output.value.class is deprecated.
Instead, use mapreduce.job.output.value.class
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.mapoutput.value.class is deprecated.
Instead, use mapreduce.map.output.value.class
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.used.genericoptionsparser is deprecated.
Instead, use mapreduce.client.genericoptionsparser.used
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead,
use mapreduce.job.name
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead,
use mapreduce.input.fileinputformat.inputdir
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead,
use mapreduce.output.fileoutputformat.outputdir
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.map.task.debug.script is deprecated.
Instead, use mapreduce.map.debug.script
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead,
use mapreduce.job.maps
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.cache.files.timestamps is deprecated.
Instead, use mapreduce.job.cache.files.timestamps
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead,
use mapreduce.job.output.key.class
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.mapoutput.key.class is deprecated.
Instead, use mapreduce.map.output.key.class
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead,
use mapreduce.job.working.dir
14/04/10 10:26:33 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1395628276810_0062
14/04/10 10:26:33 INFO impl.YarnClientImpl: Submitted application application_1395628276810_0062
to ResourceManager at bpdevdmsdbs01/172.18.127.245:8032
14/04/10 10:26:33 INFO mapreduce.Job: The url to track the job: http://bpdevdmsdbs01:8088/proxy/application_1395628276810_0062/
14/04/10 10:26:33 INFO mapreduce.Job: Running job: job_1395628276810_0062
14/04/10 10:26:42 INFO mapreduce.Job: Job job_1395628276810_0062 running in uber mode : false
14/04/10 10:26:42 INFO mapreduce.Job:  map 0% reduce 0%
14/04/10 10:26:51 INFO mapreduce.Job:  map 1% reduce 0%
14/04/10 10:26:52 INFO mapreduce.Job:  map 2% reduce 0%
14/04/10 10:26:53 INFO mapreduce.Job:  map 3% reduce 0%
14/04/10 10:26:55 INFO mapreduce.Job:  map 4% reduce 0%
14/04/10 10:26:58 INFO mapreduce.Job:  map 5% reduce 0%
14/04/10 10:26:59 INFO mapreduce.Job:  map 6% reduce 0%
14/04/10 10:27:01 INFO mapreduce.Job:  map 7% reduce 0%
14/04/10 10:27:02 INFO mapreduce.Job:  map 8% reduce 0%
14/04/10 10:27:04 INFO mapreduce.Job:  map 9% reduce 0%
14/04/10 10:27:06 INFO mapreduce.Job:  map 10% reduce 0%
14/04/10 10:27:08 INFO mapreduce.Job:  map 11% reduce 0%
14/04/10 10:27:10 INFO mapreduce.Job:  map 12% reduce 0%
14/04/10 10:27:12 INFO mapreduce.Job:  map 13% reduce 0%
14/04/10 10:27:13 INFO mapreduce.Job:  map 14% reduce 0%
14/04/10 10:27:15 INFO mapreduce.Job:  map 15% reduce 0%
14/04/10 10:27:18 INFO mapreduce.Job:  map 16% reduce 0%
14/04/10 10:27:19 INFO mapreduce.Job:  map 17% reduce 0%
14/04/10 10:27:20 INFO mapreduce.Job:  map 18% reduce 0%
14/04/10 10:27:23 INFO mapreduce.Job:  map 19% reduce 0%
14/04/10 10:27:25 INFO mapreduce.Job:  map 20% reduce 0%
14/04/10 10:27:27 INFO mapreduce.Job:  map 21% reduce 0%
14/04/10 10:27:28 INFO mapreduce.Job:  map 22% reduce 0%
14/04/10 10:27:30 INFO mapreduce.Job:  map 23% reduce 0%
14/04/10 10:27:32 INFO mapreduce.Job:  map 24% reduce 0%
14/04/10 10:27:34 INFO mapreduce.Job:  map 25% reduce 0%
14/04/10 10:27:35 INFO mapreduce.Job:  map 26% reduce 0%
14/04/10 10:27:38 INFO mapreduce.Job:  map 27% reduce 0%
14/04/10 10:27:40 INFO mapreduce.Job:  map 28% reduce 0%
14/04/10 10:27:41 INFO mapreduce.Job:  map 29% reduce 0%
14/04/10 10:27:43 INFO mapreduce.Job:  map 30% reduce 0%
14/04/10 10:27:45 INFO mapreduce.Job:  map 31% reduce 0%
14/04/10 10:27:47 INFO mapreduce.Job:  map 32% reduce 0%
14/04/10 10:27:48 INFO mapreduce.Job:  map 33% reduce 0%
14/04/10 10:27:51 INFO mapreduce.Job:  map 34% reduce 0%
14/04/10 10:27:53 INFO mapreduce.Job:  map 35% reduce 0%
14/04/10 10:27:54 INFO mapreduce.Job:  map 36% reduce 0%
14/04/10 10:27:55 INFO mapreduce.Job:  map 37% reduce 0%
14/04/10 10:27:59 INFO mapreduce.Job:  map 38% reduce 0%
14/04/10 10:28:00 INFO mapreduce.Job:  map 39% reduce 0%
14/04/10 10:28:02 INFO mapreduce.Job:  map 40% reduce 0%
14/04/10 10:28:04 INFO mapreduce.Job:  map 41% reduce 0%
14/04/10 10:28:06 INFO mapreduce.Job:  map 42% reduce 0%
14/04/10 10:28:07 INFO mapreduce.Job:  map 43% reduce 0%
14/04/10 10:28:09 INFO mapreduce.Job:  map 44% reduce 0%
14/04/10 10:28:10 INFO mapreduce.Job: Task Id : attempt_1395628276810_0062_m_000149_0, Status
: FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with
code 1
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)

14/04/10 10:28:11 INFO mapreduce.Job:  map 45% reduce 0%
14/04/10 10:28:13 INFO mapreduce.Job:  map 46% reduce 0%
14/04/10 10:28:14 INFO mapreduce.Job:  map 47% reduce 0%
14/04/10 10:28:14 INFO mapreduce.Job: Task Id : attempt_1395628276810_0062_m_000149_1, Status
: FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with
code 1
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)

14/04/10 10:28:17 INFO mapreduce.Job:  map 48% reduce 0%
14/04/10 10:28:19 INFO mapreduce.Job:  map 49% reduce 0%
14/04/10 10:28:19 INFO mapreduce.Job: Task Id : attempt_1395628276810_0062_m_000149_2, Status
: FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with
code 1
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)

14/04/10 10:28:21 INFO mapreduce.Job:  map 50% reduce 0%
14/04/10 10:28:23 INFO mapreduce.Job:  map 51% reduce 0%
14/04/10 10:28:24 INFO mapreduce.Job:  map 100% reduce 100%
14/04/10 10:28:24 INFO mapreduce.Job: Job job_1395628276810_0062 failed with state FAILED
due to: Task failed task_1395628276810_0062_m_000149
Job failed as tasks failed. failedMaps:1 failedReduces:0

14/04/10 10:28:24 INFO mapreduce.Job: Counters: 33
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=15667286
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=21753912258
                HDFS: Number of bytes written=0
                HDFS: Number of read operations=486
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=0
        Job Counters
                Failed map tasks=4
                Killed map tasks=10
                Launched map tasks=176
                Other local map tasks=3
                Data-local map tasks=173
                Total time spent by all maps in occupied slots (ms)=1035708
                Total time spent by all reduces in occupied slots (ms)=0
        Map-Reduce Framework
                Map input records=164217466
                Map output records=0
                Map output bytes=0
                Map output materialized bytes=414720
                Input split bytes=23490
                Combine input records=0
                Combine output records=0
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=4750
                CPU time spent (ms)=321980
                Physical memory (bytes) snapshot=91335024640
                Virtual memory (bytes) snapshot=229819834368
                Total committed heap usage (bytes)=128240713728
        File Input Format Counters
                Bytes Read=21753888768
14/04/10 10:28:24 ERROR streaming.StreamJob: Job not Successful!
Streaming Command Failed!
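
All three attempts shown above for task m_000149 failed in the same way, which suggests (but does not prove) that the failure is tied to the records in that input split rather than to a transient node problem. A rough sketch of reproducing the exit code outside Hadoop follows, assuming a sample of the suspect input has been copied to local disk. The helper name run_locally and the file name sample.csv are hypothetical.

```python
import subprocess


def run_locally(mapper_cmd, sample_lines):
    """Feed sample lines to a mapper command on stdin.

    Returns (exit code, stdout text, stderr text) so the real Python
    traceback, if any, can be read directly.
    """
    proc = subprocess.Popen(
        mapper_cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # exchange text rather than bytes
    )
    out, err = proc.communicate("".join(sample_lines))
    return proc.returncode, out, err

# Hypothetical usage against a local sample of the input:
#   with open("sample.csv") as f:
#       rc, out, err = run_locally(["./device-mapper-v1.py"], f)
#   print(rc)
#   print(err)
```

A non-zero return code here, together with the traceback in the captured stderr, would point at the record or code path that makes the mapper exit.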


Thanks and Regards,
Truong Phan
