hadoop-user mailing list archives

From Tao <...@outlook.com>
Subject Run mr example wordcount error on hadoop-2.0.1 alpha HA
Date Fri, 14 Sep 2012 13:15:43 GMT
Hi all,

I need help.

I set up my cluster using Hadoop 2.0.1-alpha in HA mode.

When I run the wordcount example using:

         hadoop jar /usr/lib/hadoop-2.0.1-alpha/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.1-alpha.jar wordcount /tmp/input /tmp/output73

I get the errors below.

Please help.

Thanks.


[hdfs@baby20 ~]$ hadoop jar /usr/lib/hadoop-2.0.1-alpha/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.1-alpha.jar wordcount /tmp/input /tmp/output73

12/09/14 20:49:49 INFO input.FileInputFormat: Total input paths to process : 2

12/09/14 20:49:49 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

12/09/14 20:49:49 WARN snappy.LoadSnappy: Snappy native library not loaded

12/09/14 20:49:50 INFO mapreduce.JobSubmitter: number of splits:2

12/09/14 20:49:50 WARN conf.Configuration: mapred.jar is deprecated. Instead, use mapreduce.job.jar

12/09/14 20:49:50 WARN conf.Configuration: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class

12/09/14 20:49:50 WARN conf.Configuration: mapreduce.combine.class is deprecated. Instead, use mapreduce.job.combine.class

12/09/14 20:49:50 WARN conf.Configuration: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class

12/09/14 20:49:50 WARN conf.Configuration: mapred.job.name is deprecated. Instead, use mapreduce.job.name

12/09/14 20:49:50 WARN conf.Configuration: mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class

12/09/14 20:49:50 WARN conf.Configuration: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir

12/09/14 20:49:50 WARN conf.Configuration: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir

12/09/14 20:49:50 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps

12/09/14 20:49:50 WARN conf.Configuration: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class

12/09/14 20:49:50 WARN conf.Configuration: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir

12/09/14 20:49:51 INFO mapred.ResourceMgrDelegate: Submitted application application_1347626975369_0001 to ResourceManager at baby20/10.1.1.40:8040

12/09/14 20:49:51 INFO mapreduce.Job: The url to track the job: http://baby20:8088/proxy/application_1347626975369_0001/

12/09/14 20:49:51 INFO mapreduce.Job: Running job: job_1347626975369_0001

12/09/14 20:50:05 INFO mapreduce.Job: Job job_1347626975369_0001 running in uber mode : false

12/09/14 20:50:05 INFO mapreduce.Job:  map 0% reduce 0%

12/09/14 20:50:20 INFO mapreduce.Job: Task Id : attempt_1347626975369_0001_m_000001_0, Status : FAILED

Error: Java heap space

12/09/14 20:50:20 WARN mapreduce.Job: Error reading task output Server returned HTTP response code: 400 for URL: http://baby18:8080/tasklog?plaintext=true&attemptid=attempt_1347626975369_0001_m_000001_0&filter=stdout

12/09/14 20:50:20 WARN mapreduce.Job: Error reading task output Server returned HTTP response code: 400 for URL: http://baby18:8080/tasklog?plaintext=true&attemptid=attempt_1347626975369_0001_m_000001_0&filter=stderr

12/09/14 20:50:20 INFO mapreduce.Job: Task Id : attempt_1347626975369_0001_m_000000_0, Status : FAILED

Error: Java heap space

12/09/14 20:50:20 WARN mapreduce.Job: Error reading task output Server returned HTTP response code: 400 for URL: http://baby18:8080/tasklog?plaintext=true&attemptid=attempt_1347626975369_0001_m_000000_0&filter=stdout

12/09/14 20:50:20 WARN mapreduce.Job: Error reading task output Server returned HTTP response code: 400 for URL: http://baby18:8080/tasklog?plaintext=true&attemptid=attempt_1347626975369_0001_m_000000_0&filter=stderr

12/09/14 20:50:30 INFO mapreduce.Job: Task Id : attempt_1347626975369_0001_m_000000_1, Status : FAILED

Error: Java heap space

12/09/14 20:50:30 WARN mapreduce.Job: Error reading task output Server returned HTTP response code: 400 for URL: http://baby16:8080/tasklog?plaintext=true&attemptid=attempt_1347626975369_0001_m_000000_1&filter=stdout

12/09/14 20:50:30 WARN mapreduce.Job: Error reading task output Server returned HTTP response code: 400 for URL: http://baby16:8080/tasklog?plaintext=true&attemptid=attempt_1347626975369_0001_m_000000_1&filter=stderr

12/09/14 20:50:31 INFO mapreduce.Job: Task Id : attempt_1347626975369_0001_m_000001_1, Status : FAILED

Error: Java heap space

12/09/14 20:50:31 WARN mapreduce.Job: Error reading task output Server returned HTTP response code: 400 for URL: http://baby16:8080/tasklog?plaintext=true&attemptid=attempt_1347626975369_0001_m_000001_1&filter=stdout

12/09/14 20:50:31 WARN mapreduce.Job: Error reading task output Server returned HTTP response code: 400 for URL: http://baby16:8080/tasklog?plaintext=true&attemptid=attempt_1347626975369_0001_m_000001_1&filter=stderr

12/09/14 20:50:39 INFO mapreduce.Job: Task Id : attempt_1347626975369_0001_m_000000_2, Status : FAILED

Error: Java heap space

12/09/14 20:50:39 WARN mapreduce.Job: Error reading task output Server returned HTTP response code: 400 for URL: http://baby16:8080/tasklog?plaintext=true&attemptid=attempt_1347626975369_0001_m_000000_2&filter=stdout

12/09/14 20:50:39 WARN mapreduce.Job: Error reading task output Server returned HTTP response code: 400 for URL: http://baby16:8080/tasklog?plaintext=true&attemptid=attempt_1347626975369_0001_m_000000_2&filter=stderr

12/09/14 20:50:39 INFO mapreduce.Job: Task Id : attempt_1347626975369_0001_m_000001_2, Status : FAILED

Error: Java heap space

12/09/14 20:50:39 WARN mapreduce.Job: Error reading task output Server returned HTTP response code: 400 for URL: http://baby16:8080/tasklog?plaintext=true&attemptid=attempt_1347626975369_0001_m_000001_2&filter=stdout

12/09/14 20:50:39 WARN mapreduce.Job: Error reading task output Server returned HTTP response code: 400 for URL: http://baby16:8080/tasklog?plaintext=true&attemptid=attempt_1347626975369_0001_m_000001_2&filter=stderr

12/09/14 20:51:49 INFO mapreduce.Job: Job job_1347626975369_0001 failed with state FAILED due to: 

12/09/14 20:51:49 INFO mapreduce.Job: Counters: 39

        File System Counters

                FILE: Number of bytes read=120

                FILE: Number of bytes written=49407

                FILE: Number of read operations=0

                FILE: Number of large read operations=0

                FILE: Number of write operations=0

                HDFS: Number of bytes read=0

                HDFS: Number of bytes written=0

                HDFS: Number of read operations=0

                HDFS: Number of large read operations=0

                HDFS: Number of write operations=0

        Job Counters 

                Failed map tasks=7

                Launched map tasks=8

                Launched reduce tasks=1

                Other local map tasks=6

                Rack-local map tasks=2

                Total time spent by all maps in occupied slots (ms)=263448

                Total time spent by all reduces in occupied slots (ms)=0

        Map-Reduce Framework

                Combine input records=0

                Combine output records=0

                Reduce input groups=0

                Reduce shuffle bytes=0

                Reduce input records=0

                Reduce output records=0

                Spilled Records=0

                Shuffled Maps =0

                Failed Shuffles=0

                Merged Map outputs=0

                GC time elapsed (ms)=44

                CPU time spent (ms)=800

                Physical memory (bytes) snapshot=125038592

                Virtual memory (bytes) snapshot=709242880

                Total committed heap usage (bytes)=85262336

        Shuffle Errors

                BAD_ID=0

                CONNECTION=0

                IO_ERROR=0

                WRONG_LENGTH=0

                WRONG_MAP=0

                WRONG_REDUCE=0

        File Output Format Counters 

                Bytes Written=0
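
To make the pattern in the client output easier to see, here is a minimal Python sketch that counts FAILED attempts per task from output like the above. The function name `summarize_failures` and the regular expression are illustrative assumptions written for this message, not part of any Hadoop tool; the sketch collapses whitespace first so that lines wrapped by a mail client still match.

```python
import re
from collections import Counter

# Matches client lines such as:
#   "... INFO mapreduce.Job: Task Id : attempt_<ts>_<job>_m_000001_0, Status : FAILED"
# Groups: full attempt id, task type (m or r), task index, attempt number, status.
ATTEMPT_RE = re.compile(
    r"Task Id : (attempt_\d+_\d+_([mr])_(\d+)_(\d+)), Status : (\w+)"
)

def summarize_failures(log_text):
    """Count FAILED attempts keyed by (task type, task index)."""
    # Collapse all whitespace runs so wrapped log lines are rejoined.
    flat = re.sub(r"\s+", " ", log_text)
    failures = Counter()
    for _attempt, kind, task, _try, status in ATTEMPT_RE.findall(flat):
        if status == "FAILED":
            failures[(kind, int(task))] += 1
    return failures

sample = """
12/09/14 20:50:20 INFO mapreduce.Job: Task Id : attempt_1347626975369_0001_m_000001_0, Status
: FAILED
Error: Java heap space
12/09/14 20:50:20 INFO mapreduce.Job: Task Id : attempt_1347626975369_0001_m_000000_0, Status
: FAILED
Error: Java heap space
12/09/14 20:50:30 INFO mapreduce.Job: Task Id : attempt_1347626975369_0001_m_000000_1, Status
: FAILED
"""

for (kind, task), count in sorted(summarize_failures(sample).items()):
    print(f"{kind}_{task:06d}: {count} failed attempt(s)")
```

Run against the full output, this would show both map tasks failing on every attempt, on more than one node, which suggests a job-level setting rather than a single bad host.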

 

 

 

 

Part of the AM container log:

 

2012-09-14 20:50:20,227 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReduces:0 ScheduledMaps:0 ScheduledReduces:0 AssignedMaps:2 AssignedReduces:1 completedMaps:0 completedReduces:0 containersAllocated:3 containersReleased:0 hostLocalAssigned:0 rackLocalAssigned:2 availableResources(headroom):memory: 12288

2012-09-14 20:50:20,230 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReduces:0 ScheduledMaps:0 ScheduledReduces:0 AssignedMaps:2 AssignedReduces:1 completedMaps:0 completedReduces:0 containersAllocated:3 containersReleased:0 hostLocalAssigned:0 rackLocalAssigned:2 availableResources(headroom):memory: 12288

2012-09-14 20:50:21,232 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReduces:0 ScheduledMaps:0 ScheduledReduces:0 AssignedMaps:2 AssignedReduces:1 completedMaps:0 completedReduces:0 containersAllocated:3 containersReleased:0 hostLocalAssigned:0 rackLocalAssigned:2 availableResources(headroom):memory: 12288

2012-09-14 20:50:21,237 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReduces:0 ScheduledMaps:0 ScheduledReduces:0 AssignedMaps:2 AssignedReduces:1 completedMaps:0 completedReduces:0 containersAllocated:3 containersReleased:0 hostLocalAssigned:0 rackLocalAssigned:2 availableResources(headroom):memory: 12288

2012-09-14 20:50:21,301 INFO [IPC Server handler 2 on 62509] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID : jvm_1347626975369_0001_r_000004 asked for a task

2012-09-14 20:50:21,302 INFO [IPC Server handler 2 on 62509] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID: jvm_1347626975369_0001_r_000004 given task: attempt_1347626975369_0001_r_000000_0

2012-09-14 20:50:22,240 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReduces:0 ScheduledMaps:0 ScheduledReduces:0 AssignedMaps:2 AssignedReduces:1 completedMaps:0 completedReduces:0 containersAllocated:3 containersReleased:0 hostLocalAssigned:0 rackLocalAssigned:2 availableResources(headroom):memory: 12288

2012-09-14 20:50:22,243 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReduces:0 ScheduledMaps:0 ScheduledReduces:0 AssignedMaps:2 AssignedReduces:1 completedMaps:0 completedReduces:0 containersAllocated:3 containersReleased:0 hostLocalAssigned:0 rackLocalAssigned:2 availableResources(headroom):memory: 12288

2012-09-14 20:50:22,661 FATAL [IPC Server handler 3 on 62509] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1347626975369_0001_m_000001_0 - exited : Java heap space

2012-09-14 20:50:22,661 INFO [IPC Server handler 3 on 62509] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Diagnostics report from attempt_1347626975369_0001_m_000001_0: Error: Java heap space

2012-09-14 20:50:22,664 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1347626975369_0001_m_000001_0: Error: Java heap space

2012-09-14 20:50:22,683 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1347626975369_0001_m_000001_0 TaskAttempt Transitioned from RUNNING to FAIL_CONTAINER_CLEANUP

2012-09-14 20:50:22,684 INFO [ContainerLauncher #3] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_REMOTE_CLEANUP for container container_1347626975369_0001_01_000003 taskAttempt attempt_1347626975369_0001_m_000001_0

2012-09-14 20:50:22,684 INFO [ContainerLauncher #3] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: KILLING attempt_1347626975369_0001_m_000001_0

2012-09-14 20:50:22,744 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1347626975369_0001_m_000001_0 TaskAttempt Transitioned from FAIL_CONTAINER_CLEANUP to FAIL_TASK_CLEANUP

2012-09-14 20:50:22,746 INFO [TaskCleaner #0] org.apache.hadoop.mapreduce.v2.app.taskclean.TaskCleanerImpl: Processing the event EventType: TASK_CLEAN

2012-09-14 20:50:22,754 WARN [TaskCleaner #0] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Could not delete hdfs://hadoopii/tmp/output73/_temporary/1/_temporary/attempt_1347626975369_0001_m_000001_0

2012-09-14 20:50:22,762 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1347626975369_0001_m_000001_0 TaskAttempt Transitioned from FAIL_TASK_CLEANUP to FAILED

2012-09-14 20:50:22,770 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1347626975369_0001_m_000001_1 TaskAttempt Transitioned from NEW to UNASSIGNED

2012-09-14 20:50:22,770 INFO [Thread-46] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 1 failures on node baby18

2012-09-14 20:50:22,770 INFO [Thread-46] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Added attempt_1347626975369_0001_m_000001_1 to list of failed maps

2012-09-14 20:50:23,003 FATAL [IPC Server handler 4 on 62509] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1347626975369_0001_m_000000_0 - exited : Java heap space

2012-09-14 20:50:23,003 INFO [IPC Server handler 4 on 62509] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Diagnostics report from attempt_1347626975369_0001

 

 

 

 

 

 

My yarn-site.xml

<?xml version="1.0"?>
<configuration>

  <property>
    <name>yarn.resourcemanager.address</name>
    <value>baby20:8040</value>
  </property>

  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>baby20:8030</value>
  </property>

  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>baby20:8025</value>
  </property>

  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>baby20:8141</value>
  </property>

  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>baby20:8088</value>
  </property>

  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce.shuffle</value>
  </property>

  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>

  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/home/yarn/local</value>
  </property>

  <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>/home/yarn/log</value>
  </property>

  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>1536</value>
  </property>

  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>64</value>
  </property>

  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>1024</value>
  </property>

  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>2.1</value>
  </property>

  <property>
    <name>yarn.app.mapreduce.am.staging-dir</name>
    <value>/home/yarn/staging</value>
  </property>

</configuration>
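
For context on these memory settings, here is a rough Python sketch of how many task containers could fit on one NodeManager. It assumes (this is my assumption, not something shown in the logs) that the scheduler rounds each request up to a multiple of yarn.scheduler.minimum-allocation-mb and caps it at yarn.scheduler.maximum-allocation-mb, and it ignores the memory taken by the AM container. The 256 MB and 512 MB requests are the map and reduce sizes from my mapred-site.xml; the helper names are hypothetical.

```python
import math

def normalized_request(requested_mb, min_alloc_mb, max_alloc_mb):
    """Round a request up to a multiple of the minimum allocation, capped at the maximum.

    Assumption: this mirrors how the scheduler normalizes container requests.
    """
    rounded = math.ceil(requested_mb / min_alloc_mb) * min_alloc_mb
    return min(rounded, max_alloc_mb)

def containers_per_node(node_mb, requested_mb, min_alloc_mb, max_alloc_mb):
    """How many containers of the given size fit in one NodeManager's memory."""
    return node_mb // normalized_request(requested_mb, min_alloc_mb, max_alloc_mb)

# Values taken from the configuration in this message.
NODE_MB, MIN_MB, MAX_MB = 1536, 64, 1024

print("map containers per node:", containers_per_node(NODE_MB, 256, MIN_MB, MAX_MB))
print("reduce containers per node:", containers_per_node(NODE_MB, 512, MIN_MB, MAX_MB))
```

Under those assumptions a node has room for several map containers, so the container sizes themselves do not look like the limiting factor.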

 

 

My mapred-site.xml

 

<configuration>

  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>

  <!-- mem limit for maps  -->
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>256</value>
  </property>

  <!-- Larger heap-size for child jvms of maps  -->
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx128m</value>
  </property>

  <!-- Larger resource limit for reduces  -->
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>512</value>
  </property>

  <!-- Larger heap-size for child jvms of reduces  -->
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx256m</value>
  </property>

  <!-- mapreduce.task.io.sort.mb  -->
  <property>
    <name>mapreduce.task.io.sort.mb</name>
    <value>256</value>
  </property>

  <!-- More streams merged at once while sorting files -->
  <property>
    <name>mapreduce.task.io.sort.factor</name>
    <value>100</value>
  </property>

  <!-- Higher number of parallel copies run by reduces to fetch outputs from very large number of maps  -->
  <property>
    <name>mapreduce.reduce.shuffle.parallelcopies</name>
    <value>50</value>
  </property>

  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>baby20:10020</value>
  </property>

  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>baby20:19888</value>
  </property>

  <property>
    <name>mapreduce.jobhistory.intermediate-done-dir</name>
    <value>/home/mapred/staging/innerDone</value>
  </property>

  <property>
    <name>mapreduce.jobhistory.done-dir</name>
    <value>/home/mapred/staging/done</value>
  </property>

  <property>
    <name>mapred.system.dir</name>
    <value>file:/home/yarn/mapred/system</value>
    <final>true</final>
  </property>

  <property>
    <name>mapred.local.dir</name>
    <value>file:/home/yarn/mapred/local</value>
    <final>true</final>
  </property>

  <property>
    <name>mapred.child.env</name>
    <value>JAVA_LIBRARY_PATH=/usr/lib/hadoop-2.0.1-alpha/lib/native/Linux-amd64-64</value>
  </property>

  <property>
    <name>mapred.task.timeout</name>
    <value>18000000</value>
  </property>

</configuration>
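
One thing that stands out in this file: mapreduce.task.io.sort.mb is 256 while the map heap is capped by -Xmx128m. The map-side sort buffer is allocated inside the task JVM heap, so a buffer larger than the heap could plausibly explain the "Java heap space" failures above. This is a hypothesis, not a confirmed diagnosis. A minimal Python sketch of that check follows; the helper names (`load_properties`, `max_heap_mb`, `check_sort_buffer`) are illustrative only and are not Hadoop APIs.

```python
import re
import xml.etree.ElementTree as ET

def load_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop *-site.xml string."""
    root = ET.fromstring(xml_text.strip())
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props

def max_heap_mb(java_opts):
    """Extract the -Xmx value in MB from a JVM options string, or None if absent."""
    match = re.search(r"-Xmx(\d+)([mMgG])", java_opts)
    if match is None:
        return None
    size = int(match.group(1))
    return size * 1024 if match.group(2) in "gG" else size

def check_sort_buffer(props):
    """Return a warning string if the sort buffer cannot fit in the map heap."""
    heap_mb = max_heap_mb(props.get("mapreduce.map.java.opts", ""))
    sort_mb = props.get("mapreduce.task.io.sort.mb")
    if heap_mb is None or sort_mb is None:
        return None  # not enough information to compare
    if int(sort_mb) >= heap_mb:
        return f"io.sort.mb={sort_mb} MB does not fit in a {heap_mb} MB map heap"
    return None

# The two relevant properties, copied from the mapred-site.xml above.
SAMPLE = """
<configuration>
  <property><name>mapreduce.map.java.opts</name><value>-Xmx128m</value></property>
  <property><name>mapreduce.task.io.sort.mb</name><value>256</value></property>
</configuration>
"""

print(check_sort_buffer(load_properties(SAMPLE)))
```

If the check fires, the two obvious things to try would be lowering mapreduce.task.io.sort.mb well below the map heap, or raising -Xmx (and mapreduce.map.memory.mb with it).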

 

 

