hadoop-mapreduce-user mailing list archives

From Ruben Quintero <rfq_...@yahoo.com>
Subject Re: Out of Memory during Reduce Merge
Date Tue, 15 Jun 2010 17:02:59 GMT
Yeah, I know that was an example directory. I originally had the dumps pointed at our log directory,
then tried specifying a file, then tried leaving that argument out so it would dump in the
cwd, but none of those produced heap dumps.

As for the flags, I was setting them in mapred.child.java.opts, but it seems that if that value
gets too long (the bug report mentions 146 chars?), the extra flags are ignored and only the
heapsize and the ConcMarkSweepGC get passed through.
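
For reference, here is roughly what I was trying to set (the dump path below is just a
placeholder; as mentioned above, I tried a few different locations):

<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1024M -XX:+UseConcMarkSweepGC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/hprof</value>
</property>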


Getting back to the original problem, I'm still concerned by the line:

Merging 7 files, 3005057577 bytes from disk.

Again, it seems that it's trying to merge ~3G when the heapsize is only 1G. Is there a way
to make it merge less at a time, or keep more of the data on disk, so it doesn't pull
something that is too big into memory?
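
For reference, these are the shuffle/merge settings we had been looking at (mentioned in my
first message below); as far as I can tell we're still on the 0.20 defaults for all of them,
if I'm reading the defaults right:

mapred.job.shuffle.input.buffer.percent  (default 0.70)
mapred.job.shuffle.merge.percent         (default 0.66)
mapred.inmem.merge.threshold             (default 1000)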

- Ruben





________________________________
From: Ted Yu <yuzhihong@gmail.com>
To: mapreduce-user@hadoop.apache.org
Sent: Mon, June 14, 2010 6:16:56 PM
Subject: Re: Out of Memory during Reduce Merge

/usr/local/hadoop/hprof was an example directory name.
You need to find/create a path that exists on every task tracker node.

The flags can be specified in mapred.child.java.opts



On Mon, Jun 14, 2010 at 1:52 PM, Ruben Quintero <rfq_dev@yahoo.com> wrote:

>Increasing heap size seems like it would simply be delaying the problem versus addressing
>it (i.e. with a larger load, it would OOM again). Our memory use is tight as it is, so I don't
>think we could bump it up to 2G.
>
>As for the heapdump, I tried using the flags, but I guess I ran into HADOOP-4953: it would
>drop all of the arguments for the child and pass in only the heapsize and UseConcMarkSweepGC
>(even when the latter wasn't specified, oddly enough). I reduced the arguments down to just
>-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/usr/local/hadoop/hprof, but the processes
>still aren't dumping, and I'm not sure why. Any ideas? I'm going to try more general heap dump
>requirements and see if I can get some logs.
>
>- Ruben
>
>
________________________________
>From: Ted Yu <yuzhihong@gmail.com>
>To: mapreduce-user@hadoop.apache.org
>Sent: Sun, June 13, 2010 5:06:23 PM
>
>Subject: Re: Out of Memory during Reduce Merge
>
>
>Can you increase -Xmx to 2048m?
>
>I suggest using the following flags:
>-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/usr/local/hadoop/hprof -XX:+UseConcMarkSweepGC
>-XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:MaxPermSize=512m -XX:+PrintTenuringDistribution

>
>Once you obtain the heap dump, you can use jhat to analyze it.
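>
>For example (the dump file name will be whatever the child JVM writes, something like
>java_pid12345.hprof):
>
>  jhat -J-Xmx2048m /usr/local/hadoop/hprof/java_pid12345.hprof
>
>Then point a browser at http://localhost:7000 to walk the heap.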
>
>Or you can use YourKit to profile your reducer.
>
>
>On Sun, Jun 13, 2010 at 12:31 PM, Ruben Quintero <rfq_dev@yahoo.com> wrote:
>
>>We're using hadoop 0.20.2. I saw that bug report (MAPREDUCE-1182), but our version is
>>already patched for it.
>>
>>Are there any minor fixes or workarounds you can think of? Is there anything else
>>I can send your way to give you more information as to what is happening?
>>
>>Thanks,
>>
>>- Ruben
>>
________________________________
>>From: Ted Yu <yuzhihong@gmail.com>
>>To: mapreduce-user@hadoop.apache.org
>>Sent: Sun, June 13, 2010 10:09:27 AM
>>Subject: Re: Out of Memory during Reduce Merge
>>
>>
>>From the stack trace, they are two different problems.
>>MAPREDUCE-1182 didn't solve all issues w.r.t. shuffling.
>>See the 'Shuffle In Memory OutOfMemoryError' discussion, where OOME was reported even after
>>MAPREDUCE-1182 had been incorporated.
>>
>>
>>On Sun, Jun 13, 2010 at 2:47 AM, Guo Leitao <leitao.guo@gmail.com> wrote:
>>
>>>
>>>Is this the same scenario?
>>>https://issues.apache.org/jira/browse/MAPREDUCE-1182
>>>
>>>
>>>
>>>2010/6/12 Ruben Quintero <rfq_dev@yahoo.com>
>>>>
>>>>Hi all,
>>>>
>>>>We have a MapReduce job writing a Lucene index (modeled closely after the
>>>>example in contrib), and we keep hitting out of memory exceptions in the reduce phase once
>>>>the number of files grows large.
>>>>
>>>>Here are the relevant non-default values in our mapred-site.xml:
>>>>
>>>>mapred.child.java.opts: -Xmx1024M -XX:+UseConcMarkSweepGC
>>>>mapred.reduce.parallel.copies: 20
>>>>io.sort.factor: 100
>>>>io.sort.mb: 200
>>>>
>>>>Looking through the output logs, I think the problem occurs here:
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>INFO org.apache.hadoop.mapred.ReduceTask: Merging 7 files, 3005057577 bytes from disk
>>>>
>>>>From what I can tell, it is trying to merge more than twice the heapsize
>>>>from disk, thus triggering the OOM. I've posted part of the log file below.
>>>>
>>>>We were looking at various mapreduce settings (mapred.job.shuffle.input.buffer.percent,
>>>>mapred.job.shuffle.merge.percent, mapred.inmem.merge.threshold), but weren't sure which
>>>>settings might cause this issue. Does anyone have any insight or suggestions as to where to start?
>>>>
>>>>Thank you,
>>>>
>>>>- Ruben
>>>>
>>>>
>>>>
>>>>2010-06-11 15:27:11,519 INFO org.apache.hadoop.mapred.Merger: Merging 25 sorted segments
>>>>2010-06-11 15:27:11,528 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 25 segments left of total size: 321695767 bytes
>>>>
>>>>
>>>>
>>>>
>>>>2010-06-11 15:27:12,783 INFO org.apache.hadoop.mapred.ReduceTask: Merged 25 segments, 321695767 bytes to disk to satisfy reduce memory limit
>>>>2010-06-11 15:27:12,785 INFO org.apache.hadoop.mapred.ReduceTask: Merging 7 files, 3005057577 bytes from disk
>>>>
>>>>
>>>>
>>>>
>>>>2010-06-11 15:27:12,799 INFO org.apache.hadoop.mapred.ReduceTask: Merging 0 segments, 0 bytes from memory into reduce
>>>>2010-06-11 15:27:12,799 INFO org.apache.hadoop.mapred.Merger: Merging 7 sorted segments
>>>>2010-06-11 15:27:20,891 FATAL org.apache.hadoop.mapred.TaskTracker: Error running child : java.lang.OutOfMemoryError: Java heap space
>>>>	at org.apache.hadoop.mapred.IFile$Reader.readNextBlock(IFile.java:342)
>>>>	at org.apache.hadoop.mapred.IFile$Reader.next(IFile.java:404)
>>>>	at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:220)
>>>>	at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:420)
>>>>	at org.apache.hadoop.mapred.Merger.merge(Merger.java:142)
>>>>	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.createKVIterator(ReduceTask.java:2298)
>>>>	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$400(ReduceTask.java:570)
>>>>	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
>>>>	at org.apache.hadoop.mapred.Child.main(Child.java:170)
>>>>
>>>>
>>>
>>
>>
>
>



      