hadoop-mapreduce-user mailing list archives

From Ruben Quintero <rfq_...@yahoo.com>
Subject Re: Out of Memory during Reduce Merge
Date Mon, 14 Jun 2010 20:52:55 GMT
Increasing the heap size seems like it would simply delay the problem rather than address it
(i.e. with a larger load, it would OOM again). Our memory use is tight as it is, so I don't
think we could bump it up to 2G.

As for the heap dump, I tried using the flags, but I think I ran into HADOOP-4953: the child
JVM would get only the heap size and UseConcMarkSweepGC (even when the latter wasn't specified,
oddly enough), with the rest of the arguments dropped. I reduced the arguments down to just
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/usr/local/hadoop/hprof, but the processes
still aren't dumping, and I'm not sure why. Any ideas? I'm going to try more general heap-dump
settings and see if I can get some logs.
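
For reference, the child opts line in our mapred-site.xml now reads roughly like this (heap
size kept, everything else dropped):

mapred.child.java.opts: -Xmx1024M -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/usr/local/hadoop/hprof

If I understand the flag right, since the dump path is a directory, each child should write a
java_pid<pid>.hprof file there when it OOMs.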

- Ruben

________________________________
From: Ted Yu <yuzhihong@gmail.com>
To: mapreduce-user@hadoop.apache.org
Sent: Sun, June 13, 2010 5:06:23 PM
Subject: Re: Out of Memory during Reduce Merge

Can you increase -Xmx to 2048m?

I suggest using the following flags:
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/usr/local/hadoop/hprof -XX:+UseConcMarkSweepGC
-XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:MaxPermSize=512m -XX:+PrintTenuringDistribution
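
The PrintGC* flags should show, in the task's stdout/stderr logs, whether the heap fills up
gradually or in one large allocation.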


Once you obtain the heap dump, you can use jhat to analyze it.
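For example, something like: jhat -port 7000 /usr/local/hadoop/hprof/java_pid<pid>.hprof, then
browse to http://localhost:7000 to inspect the heap.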

Or you can use YourKit to profile your reducer.


On Sun, Jun 13, 2010 at 12:31 PM, Ruben Quintero <rfq_dev@yahoo.com> wrote:

>We're using hadoop 0.20.2. I saw that bug report (1182), but our version is already
>patched for it.
>
>Are there any minor fixes or workarounds you can think of? Is there anything else I
>can send your way to give you more information about what is happening?
>
>Thanks,
>
>- Ruben
>
>________________________________
>From: Ted Yu <yuzhihong@gmail.com>
>To: mapreduce-user@hadoop.apache.org
>Sent: Sun, June 13, 2010 10:09:27 AM
>Subject: Re: Out of Memory during Reduce Merge
>
>From the stack traces, these are two different problems.
>MAPREDUCE-1182 didn't solve all issues w.r.t. shuffling.
>See the 'Shuffle In Memory OutOfMemoryError' discussion, where an OOME was reported
>even when MAPREDUCE-1182 had been incorporated.
>
>
>On Sun, Jun 13, 2010 at 2:47 AM, Guo Leitao <leitao.guo@gmail.com> wrote:
>
>>
>>Is this the same scenario?
>>https://issues.apache.org/jira/browse/MAPREDUCE-1182
>>
>>2010/6/12 Ruben Quintero <rfq_dev@yahoo.com>
>>
>>>Hi all,
>>>
>>>We have a MapReduce job writing a Lucene index (modeled closely after the example
>>>in contrib), and we keep hitting out-of-memory exceptions in the reduce phase once
>>>the number of files grows large.
>>>
>>>Here are the relevant non-default values in our mapred-site.xml:
>>>
>>>mapred.child.java.opts: -Xmx1024M -XX:+UseConcMarkSweepGC
>>>mapred.reduce.parallel.copies: 20
>>>io.sort.factor: 100
>>>io.sort.mb: 200
>>>
>>>Looking through the output logs, I think the problem occurs here:
>>>
>>>INFO org.apache.hadoop.mapred.ReduceTask: Merging 7 files, 3005057577 bytes from disk
>>>
>>>From what I can tell, it is trying to merge more than twice the heap size's worth
>>>of data from disk, thus triggering the OOM. I've posted part of the log file below.
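>>>(3,005,057,577 bytes is roughly 2.8 GB, nearly three times our 1024 MB child heap.)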
>>>
>>>We were looking at various mapreduce settings (mapred.job.shuffle.input.buffer.percent,
>>>mapred.job.shuffle.merge.percent, mapred.inmem.merge.threshold), but weren't sure which
>>>settings might cause this issue. Does anyone have any insight or suggestions as to
>>>where to start?
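>>>
>>>For reference, here are those settings with what I believe are the 0.20 defaults we'd
>>>be tuning down from:
>>>
>>>mapred.job.shuffle.input.buffer.percent: 0.70
>>>mapred.job.shuffle.merge.percent: 0.66
>>>mapred.inmem.merge.threshold: 1000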
>>>
>>>Thank you,
>>>
>>>- Ruben
>>>
>>>
>>>2010-06-11 15:27:11,519 INFO org.apache.hadoop.mapred.Merger: Merging 25 sorted segments
>>>2010-06-11 15:27:11,528 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 25 segments left of total size: 321695767 bytes
>>>2010-06-11 15:27:12,783 INFO org.apache.hadoop.mapred.ReduceTask: Merged 25 segments, 321695767 bytes to disk to satisfy reduce memory limit
>>>2010-06-11 15:27:12,785 INFO org.apache.hadoop.mapred.ReduceTask: Merging 7 files, 3005057577 bytes from disk
>>>2010-06-11 15:27:12,799 INFO org.apache.hadoop.mapred.ReduceTask: Merging 0 segments, 0 bytes from memory into reduce
>>>2010-06-11 15:27:12,799 INFO org.apache.hadoop.mapred.Merger: Merging 7 sorted segments
>>>2010-06-11 15:27:20,891 FATAL org.apache.hadoop.mapred.TaskTracker: Error running child : java.lang.OutOfMemoryError: Java heap space
>>>	at org.apache.hadoop.mapred.IFile$Reader.readNextBlock(IFile.java:342)
>>>	at org.apache.hadoop.mapred.IFile$Reader.next(IFile.java:404)
>>>	at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:220)
>>>	at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:420)
>>>	at org.apache.hadoop.mapred.Merger.merge(Merger.java:142)
>>>	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.createKVIterator(ReduceTask.java:2298)
>>>	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$400(ReduceTask.java:570)
>>>	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
>>>	at org.apache.hadoop.mapred.Child.main(Child.java:170)
>>
>