hadoop-mapreduce-user mailing list archives

From "Matrangola, Geoffrey" <Geoffrey_Matrang...@sra.com>
Subject Reducer running out of heap before it gets to the reduce() method
Date Wed, 07 Sep 2011 15:57:26 GMT
I think my job is running out of memory before it calls reduce() in the
reducer.  The job emits large blocks of binary data from the Mapper.
Each record emitted from the mappers should be small enough to fit in
memory.  However, if the framework tries to keep a bunch of records for
one key in memory at once, for sorting or something like that, it would
exceed the heap.  Is there any way to make sure it is not trying to hold
too many records in memory at the same time?
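
For reference, a minimal sketch of the reduce-side settings that seem relevant to this question, assuming Hadoop 0.20-era property names from mapred-default.xml. The class name, method names, and every value below are hypothetical placeholders, not tested recommendations, and passing them as -D options assumes the job driver goes through ToolRunner.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: collects candidate reduce-side settings and renders
// them as generic "-D key=value" options. It does not talk to Hadoop itself.
class ShuffleTuningSketch {

    // Candidate settings. The property names are the 0.20-era names;
    // the values are illustrative assumptions only.
    static Map<String, String> candidateSettings() {
        Map<String, String> settings = new LinkedHashMap<>();
        // Fraction of the reducer heap used to hold fetched map outputs
        // during the shuffle (stock default is 0.70).
        settings.put("mapred.job.shuffle.input.buffer.percent", "0.30");
        // Fill level of that buffer at which an in-memory merge starts
        // (stock default is 0.66).
        settings.put("mapred.job.shuffle.merge.percent", "0.50");
        // Number of streams merged at once; the log line
        // "Triggering merge of 10 files" matches the stock default of 10.
        settings.put("io.sort.factor", "5");
        return settings;
    }

    // Render the settings in the order they were added.
    static String asGenericOptions(Map<String, String> settings) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : settings.entrySet()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append("-D ").append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(asGenericOptions(candidateSettings()));
    }
}
```

Whether lowering any of these actually avoids the failure is an open question; the sketch only names the knobs that bound how much map output the reducer buffers and how many segments it merges at once.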

It looks like it's running out of memory in the merge pass, before it
gets to my code in the reducer.  Any ideas?

2011-09-06 14:26:42,546 INFO org.apache.hadoop.mapred.ReduceTask: Initiating in-memory merge with 5 segments...

2011-09-06 14:26:42,547 INFO org.apache.hadoop.mapred.Merger: Merging 5 sorted segments

2011-09-06 14:26:42,547 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 5 segments left of total size: 1015423661 bytes

2011-09-06 14:26:49,300 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201108300523_0009_r_000000_0 Scheduled 1 outputs (0 slow hosts and2 dup hosts)

2011-09-06 14:26:55,093 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201108300523_0009_r_000000_0 Merge of the 5 files in-memory complete. Local file is /data2/mapred/local/taskTracker/matrangolag/jobcache/job_201108300523_0009/attempt_201108300523_0009_r_000000_0/output/map_6.out of size 1015423657

2011-09-06 14:26:55,095 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201108300523_0009_r_000000_0We have  19 map outputs on disk. Triggering merge of 10 files

2011-09-06 14:26:55,109 INFO org.apache.hadoop.mapred.Merger: Merging 10 sorted segments

2011-09-06 14:26:56,230 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201108300523_0009_r_000000_0 Scheduled 1 outputs (0 slow hosts and1 dup hosts)

2011-09-06 14:26:56,234 INFO org.apache.hadoop.mapred.ReduceTask: Initiating in-memory merge with 6 segments...

2011-09-06 14:26:56,235 INFO org.apache.hadoop.mapred.Merger: Merging 6 sorted segments

2011-09-06 14:26:56,236 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 6 segments left of total size: 1007886390 bytes

2011-09-06 14:26:56,255 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201108300523_0009_r_000000_0 Need another 9 map output(s) where 8 is already in progress

2011-09-06 14:26:56,255 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201108300523_0009_r_000000_0 Scheduled 0 outputs (0 slow hosts and1 dup hosts)

2011-09-06 14:27:06,112 FATAL org.apache.hadoop.mapred.Task: attempt_201108300523_0009_r_000000_0 : Failed to merge on the local FSjava.lang.OutOfMemoryError: Java heap space
            at org.apache.hadoop.mapred.IFile$Reader.readNextBlock(IFile.java:355)
            at org.apache.hadoop.mapred.IFile$Reader.next(IFile.java:417)
            at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:220)
            at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:420)
            at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:381)
            at org.apache.hadoop.mapred.Merger.merge(Merger.java:60)
            at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$LocalFSMerger.run(ReduceTask.java:2585)
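
To put the in-memory merge sizes from the log above in perspective, here is a minimal sketch that pulls the "total size" figures out of the Merger lines and converts them to MiB. The class and method names are hypothetical; only the two byte counts come from the log.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading merge sizes out of the task log text.
class MergeSizeSketch {

    // Matches "total size: <n> bytes"; \s+ tolerates the mail client
    // wrapping the line between the number and the word "bytes".
    private static final Pattern TOTAL_SIZE =
            Pattern.compile("total size: (\\d+)\\s+bytes");

    // Return every merge size found in the log text, in order of appearance.
    static List<Long> mergeSizes(String log) {
        List<Long> sizes = new ArrayList<>();
        Matcher m = TOTAL_SIZE.matcher(log);
        while (m.find()) {
            sizes.add(Long.parseLong(m.group(1)));
        }
        return sizes;
    }

    // Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes).
    static double toMiB(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    // Format one size with a fixed locale so the decimal separator is stable.
    static String describe(long bytes) {
        return String.format(Locale.ROOT, "%d bytes = %.1f MiB", bytes, toMiB(bytes));
    }

    public static void main(String[] args) {
        // The two "last merge-pass" lines copied from the log.
        String log =
                "Down to the last merge-pass, with 5 segments left of total size: 1015423661 bytes\n"
              + "Down to the last merge-pass, with 6 segments left of total size: 1007886390 bytes\n";
        for (long size : mergeSizes(log)) {
            System.out.println(describe(size));
        }
        // prints:
        // 1015423661 bytes = 968.4 MiB
        // 1007886390 bytes = 961.2 MiB
    }
}
```

Both in-memory merges are close to 1 GB, so the reducer heap was already holding roughly that much map output when the on-disk merge of 10 files began reading records.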

