hadoop-mapreduce-user mailing list archives

From: Васил Григоров <vask...@abv.bg>
Subject: WordCount MapReduce error
Date: Wed, 22 Feb 2017 18:51:54 GMT
Hello, I've been trying to run the WordCount example provided on the website on my Windows 10 machine. I have successfully built the latest Hadoop release (2.7.3) and I want to run the code in Local (Standalone) Mode, so I have not specified any configuration apart from setting the JAVA_HOME path in the "hadoop-env.cmd" file.
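The only line I changed there is the JAVA_HOME one, something like this (the JDK path shown is just a placeholder, not my actual location):

    rem hadoop-env.cmd -- JDK path below is illustrative only
    set JAVA_HOME=C:\PROGRA~1\Java\jdk1.8.0_121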
When I run the WordCount job, the map tasks complete but the reduce task fails. I get the following output:

   
    D:\Programs\hadoop-2.7.3-src\hadoop-dist\target\hadoop-2.7.3\WordCount>hadoop jar wc.jar WordCount D:\Programs\hadoop-2.7.3-src\hadoop-dist\target\hadoop-2.7.3\WordCount\input D:\Programs\hadoop-2.7.3-src\hadoop-dist\target\hadoop-2.7.3\WordCount\output
    17/02/22 18:40:43 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
    17/02/22 18:40:43 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
    17/02/22 18:40:43 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
    17/02/22 18:40:43 WARN mapreduce.JobResourceUploader: No job jar file set. User classes may not be found. See Job or Job#setJar(String).
    17/02/22 18:40:44 INFO input.FileInputFormat: Total input paths to process : 2
    17/02/22 18:40:44 INFO mapreduce.JobSubmitter: number of splits:2
    17/02/22 18:40:44 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local334410887_0001
    17/02/22 18:40:45 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
    17/02/22 18:40:45 INFO mapreduce.Job: Running job: job_local334410887_0001
    17/02/22 18:40:45 INFO mapred.LocalJobRunner: OutputCommitter set in config null
    17/02/22 18:40:45 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
    17/02/22 18:40:45 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
    17/02/22 18:40:45 INFO mapred.LocalJobRunner: Waiting for map tasks
    17/02/22 18:40:45 INFO mapred.LocalJobRunner: Starting task: attempt_local334410887_0001_m_000000_0
    17/02/22 18:40:45 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
    17/02/22 18:40:45 INFO util.ProcfsBasedProcessTree: ProcfsBasedProcessTree currently is supported only on Linux.
    17/02/22 18:40:45 INFO mapred.Task: Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@3019d00f
    17/02/22 18:40:45 INFO mapred.MapTask: Processing split: file:/D:/Programs/hadoop-2.7.3-src/hadoop-dist/target/hadoop-2.7.3/WordCount/input/file02:0+27
    17/02/22 18:40:45 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
    17/02/22 18:40:45 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
    17/02/22 18:40:45 INFO mapred.MapTask: soft limit at 83886080
    17/02/22 18:40:45 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
    17/02/22 18:40:45 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
    17/02/22 18:40:45 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
    17/02/22 18:40:45 INFO mapred.LocalJobRunner:
    17/02/22 18:40:45 INFO mapred.MapTask: Starting flush of map output
    17/02/22 18:40:45 INFO mapred.MapTask: Spilling map output
    17/02/22 18:40:45 INFO mapred.MapTask: bufstart = 0; bufend = 44; bufvoid = 104857600
    17/02/22 18:40:45 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214384(104857536); length = 13/6553600
    17/02/22 18:40:45 INFO mapred.MapTask: Finished spill 0
    17/02/22 18:40:45 INFO mapred.Task: Task:attempt_local334410887_0001_m_000000_0 is done. And is in the process of committing
    17/02/22 18:40:45 INFO mapred.LocalJobRunner: map
    17/02/22 18:40:45 INFO mapred.Task: Task 'attempt_local334410887_0001_m_000000_0' done.
    17/02/22 18:40:45 INFO mapred.LocalJobRunner: Finishing task: attempt_local334410887_0001_m_000000_0
    17/02/22 18:40:45 INFO mapred.LocalJobRunner: Starting task: attempt_local334410887_0001_m_000001_0
    17/02/22 18:40:46 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
    17/02/22 18:40:46 INFO util.ProcfsBasedProcessTree: ProcfsBasedProcessTree currently is supported only on Linux.
    17/02/22 18:40:46 INFO mapred.Task: Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@39ef3a7
    17/02/22 18:40:46 INFO mapred.MapTask: Processing split: file:/D:/Programs/hadoop-2.7.3-src/hadoop-dist/target/hadoop-2.7.3/WordCount/input/file01:0+25
    17/02/22 18:40:46 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
    17/02/22 18:40:46 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
    17/02/22 18:40:46 INFO mapred.MapTask: soft limit at 83886080
    17/02/22 18:40:46 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
    17/02/22 18:40:46 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
    17/02/22 18:40:46 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
    17/02/22 18:40:46 INFO mapred.LocalJobRunner:
    17/02/22 18:40:46 INFO mapred.MapTask: Starting flush of map output
    17/02/22 18:40:46 INFO mapred.MapTask: Spilling map output
    17/02/22 18:40:46 INFO mapred.MapTask: bufstart = 0; bufend = 42; bufvoid = 104857600
    17/02/22 18:40:46 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214384(104857536); length = 13/6553600
    17/02/22 18:40:46 INFO mapred.MapTask: Finished spill 0
    17/02/22 18:40:46 INFO mapred.Task: Task:attempt_local334410887_0001_m_000001_0 is done. And is in the process of committing
    17/02/22 18:40:46 INFO mapred.LocalJobRunner: map
    17/02/22 18:40:46 INFO mapreduce.Job: Job job_local334410887_0001 running in uber mode : false
    17/02/22 18:40:46 INFO mapred.Task: Task 'attempt_local334410887_0001_m_000001_0' done.
    17/02/22 18:40:46 INFO mapreduce.Job:  map 100% reduce 0%
    17/02/22 18:40:46 INFO mapred.LocalJobRunner: Finishing task: attempt_local334410887_0001_m_000001_0
    17/02/22 18:40:46 INFO mapred.LocalJobRunner: map task executor complete.
    17/02/22 18:40:46 INFO mapred.LocalJobRunner: Waiting for reduce tasks
    17/02/22 18:40:46 INFO mapred.LocalJobRunner: Starting task: attempt_local334410887_0001_r_000000_0
    17/02/22 18:40:46 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
    17/02/22 18:40:46 INFO util.ProcfsBasedProcessTree: ProcfsBasedProcessTree currently is supported only on Linux.
    17/02/22 18:40:46 INFO mapred.Task: Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@13ac822f
    17/02/22 18:40:46 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@6c4d20c4
    17/02/22 18:40:46 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=334338464, maxSingleShuffleLimit=83584616, mergeThreshold=220663392, ioSortFactor=10, memToMemMergeOutputsThreshold=10
    17/02/22 18:40:46 INFO reduce.EventFetcher: attempt_local334410887_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
    17/02/22 18:40:46 INFO mapred.LocalJobRunner: reduce task executor complete.
    17/02/22 18:40:46 WARN mapred.LocalJobRunner: job_local334410887_0001
    java.lang.Exception: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in localfetcher#1
            at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
            at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
    Caused by: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in localfetcher#1
            at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
            at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
            at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
            at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
            at java.util.concurrent.FutureTask.run(FutureTask.java:266)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
            at java.lang.Thread.run(Thread.java:745)
    Caused by: java.io.FileNotFoundException: D:/tmp/hadoop-Vasil%20Grigorov/mapred/local/localRunner/Vasil%20Grigorov/jobcache/job_local334410887_0001/attempt_local334410887_0001_m_000000_0/output/file.out.index
            at org.apache.hadoop.fs.RawLocalFileSystem.open(RawLocalFileSystem.java:200)
            at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:769)
            at org.apache.hadoop.io.SecureIOUtils.openFSDataInputStream(SecureIOUtils.java:156)
            at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:71)
            at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:62)
            at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:57)
            at org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.copyMapOutput(LocalFetcher.java:124)
            at org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.doCopy(LocalFetcher.java:102)
            at org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.run(LocalFetcher.java:85)
    17/02/22 18:40:47 INFO mapreduce.Job: Job job_local334410887_0001 failed with state FAILED due to: NA
    17/02/22 18:40:47 INFO mapreduce.Job: Counters: 18
        File System Counters
                FILE: Number of bytes read=1158
                FILE: Number of bytes written=591978
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
        Map-Reduce Framework
                Map input records=2
                Map output records=8
                Map output bytes=86
                Map output materialized bytes=89
                Input split bytes=308
                Combine input records=8
                Combine output records=6
                Spilled Records=6
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=0
                Total committed heap usage (bytes)=574095360
        File Input Format Counters
                Bytes Read=52
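For completeness, the code in wc.jar is essentially the stock WordCount example from the MapReduce tutorial, which I compiled and packaged following that page (reproduced here for reference):

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

      // Emits (word, 1) for every token in the input value.
      public static class TokenizerMapper
           extends Mapper<Object, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, one);
          }
        }
      }

      // Sums the counts for each word; also used as the combiner.
      public static class IntSumReducer
           extends Reducer<Text, IntWritable, Text, IntWritable> {

        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable val : values) {
            sum += val.get();
          }
          result.set(sum);
          context.write(key, result);
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }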
I have followed every tutorial I could find and searched for a potential solution to this error, but without success. As I mentioned above, I have not added any further configuration to any of the files, because I want to run in Standalone mode rather than pseudo-distributed or fully distributed mode. I've spent a lot of time and effort to get this far and I've hit a brick wall with this error, so any help would be GREATLY appreciated.

Thank you in advance!
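P.S. One thing I noticed in the log is the warning "Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this." As I understand it, addressing that would mean reworking the driver roughly as sketched below (WordCountTool is just a placeholder name of mine; it reuses the mapper/reducer classes from the example above). I haven't tried this yet, and I don't know whether it has any bearing on the shuffle failure.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    // Hypothetical driver that goes through ToolRunner so the generic
    // options (-D, -conf, -fs, ...) are parsed before the job is set up.
    public class WordCountTool extends Configured implements Tool {

      @Override
      public int run(String[] args) throws Exception {
        // getConf() returns the Configuration that ToolRunner has already
        // populated from the command line.
        Job job = Job.getInstance(getConf(), "word count");
        job.setJarByClass(WordCountTool.class);
        job.setMapperClass(WordCount.TokenizerMapper.class);
        job.setCombinerClass(WordCount.IntSumReducer.class);
        job.setReducerClass(WordCount.IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        return job.waitForCompletion(true) ? 0 : 1;
      }

      public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new WordCountTool(), args));
      }
    }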