hadoop-common-user mailing list archives

From "real great.." <greatness.hardn...@gmail.com>
Subject Re: Reduce java.lang.OutOfMemoryError
Date Wed, 16 Feb 2011 15:18:24 GMT
Another possibility could be increasing the memory allocated to the
JVM, though I'm not sure of the exact setting.
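
If I had to guess, in 0.20 it would be mapred.child.java.opts in
mapred-site.xml; for example (an untested guess on my part):

  <property>
    <name>mapred.child.java.opts</name>
    <!-- example value only; tune to your cluster -->
    <value>-Xmx1024m</value>
  </property>

And since your traces come from shuffleInMemory, lowering
mapred.job.shuffle.input.buffer.percent (the fraction of the reduce
task's heap used to buffer map outputs during the shuffle; 0.70 by
default) might also be worth a try.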

On Wed, Feb 16, 2011 at 8:46 PM, James Seigel <james@tynt.com> wrote:

> Well, the first thing I'd ask to see (if we can) is the code or a
> description of what your reducer is doing.
>
> If it is holding on to objects too long or accumulating lists, then
> with the right amount of data you will run OOM.
>
> Another thought is that you've simply not allocated enough memory for
> the reducer to run properly. Try passing in a setting for the reducer
> that ups the memory for it; 768MB perhaps.
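>
> For example, something like this when submitting the job (a sketch,
> untested; it assumes the job runs through ToolRunner so that -D
> options are picked up, and that mapred.child.java.opts is the right
> knob in 0.20):
>
>   hadoop jar your-job.jar YourJob \
>       -D mapred.child.java.opts=-Xmx768m \
>       <input> <output>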
>
> James
>
> Sent from my mobile. Please excuse the typos.
>
> On 2011-02-16, at 8:12 AM, Kelly Burkhart <kelly.burkhart@gmail.com>
> wrote:
>
> > I have had it fail with a single reducer and with 100 reducers.
> > Ultimately it needs to be funneled to a single reducer though.
> >
> > -K
> >
> > On Wed, Feb 16, 2011 at 9:02 AM, real great..
> > <greatness.hardness@gmail.com> wrote:
> >> Hi,
> >> How many reducers are you using currently?
> >> Try increasing the number of reducers.
> >> Let me know if it helps.
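> >>
> >> (For example, -D mapred.reduce.tasks=200 on the command line, or
> >> job.setNumReduceTasks(200) in the driver; 200 is just a hypothetical
> >> number.)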
> >>
> >> On Wed, Feb 16, 2011 at 8:30 PM, Kelly Burkhart <kelly.burkhart@gmail.com> wrote:
> >>
> >>> Hello, I'm seeing frequent failures in reduce jobs, with errors
> >>> similar to this:
> >>>
> >>>
> >>> 2011-02-15 15:21:10,163 INFO org.apache.hadoop.mapred.ReduceTask:
> >>> header: attempt_201102081823_0175_m_002153_0, compressed len: 172492,
> >>> decompressed len: 172488
> >>> 2011-02-15 15:21:10,163 FATAL org.apache.hadoop.mapred.TaskRunner:
> >>> attempt_201102081823_0175_r_000034_0 : Map output copy failure :
> >>> java.lang.OutOfMemoryError: Java heap space
> >>>        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1508)
> >>>        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1408)
> >>>        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
> >>>        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)
> >>>
> >>> 2011-02-15 15:21:10,163 INFO org.apache.hadoop.mapred.ReduceTask:
> >>> Shuffling 172488 bytes (172492 raw bytes) into RAM from
> >>> attempt_201102081823_0175_m_002153_0
> >>> 2011-02-15 15:21:10,424 INFO org.apache.hadoop.mapred.ReduceTask:
> >>> header: attempt_201102081823_0175_m_002118_0, compressed len: 161944,
> >>> decompressed len: 161940
> >>> 2011-02-15 15:21:10,424 INFO org.apache.hadoop.mapred.ReduceTask:
> >>> header: attempt_201102081823_0175_m_001704_0, compressed len: 228365,
> >>> decompressed len: 228361
> >>> 2011-02-15 15:21:10,424 INFO org.apache.hadoop.mapred.ReduceTask: Task
> >>> attempt_201102081823_0175_r_000034_0: Failed fetch #1 from
> >>> attempt_201102081823_0175_m_002153_0
> >>> 2011-02-15 15:21:10,424 FATAL org.apache.hadoop.mapred.TaskRunner:
> >>> attempt_201102081823_0175_r_000034_0 : Map output copy failure :
> >>> java.lang.OutOfMemoryError: Java heap space
> >>>        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1508)
> >>>        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1408)
> >>>        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
> >>>        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)
> >>>
> >>> Some also show this:
> >>>
> >>> Error: java.lang.OutOfMemoryError: GC overhead limit exceeded
> >>>        at sun.net.www.http.ChunkedInputStream.<init>(ChunkedInputStream.java:63)
> >>>        at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:811)
> >>>        at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:632)
> >>>        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1072)
> >>>        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getInputStream(ReduceTask.java:1447)
> >>>        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1349)
> >>>        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
> >>>        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)
> >>>
> >>> The particular job I'm running is an attempt to merge multiple time
> >>> series files into a single file.  The job tracker shows the following:
> >>>
> >>>
> >>> Kind    Num Tasks    Complete   Killed    Failed/Killed Task Attempts
> >>> map     15795        15795      0         0 / 29
> >>> reduce  100          30         70        17 / 29
> >>>
> >>> All of the files I'm reading have records with a timestamp key similar to:
> >>>
> >>> 2011-01-03 08:30:00.457000<tab><record>
> >>>
> >>> My map job is a simple python program that ignores rows with times
> >>> before 08:30:00 or after 15:00:00 (see the sketch below), determines
> >>> the type of input row, and writes it to stdout with very minor
> >>> modification.  It maintains no state and
> >>> should not use any significant memory.  My reducer is the
> >>> IdentityReducer.  The input files are individually gzipped then put
> >>> into hdfs.  The total uncompressed size of the output should be around
> >>> 150G.  Our cluster is 32 nodes each of which has 16G RAM and most of
> >>> which have two 2T drives.  We're running hadoop 0.20.2.
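> >>>
> >>> The mapper is essentially equivalent to this (a simplified sketch,
> >>> not the actual code):
> >>>
> >>> import sys
> >>>
> >>> for line in sys.stdin:
> >>>     timestamp, record = line.rstrip('\n').split('\t', 1)
> >>>     time_of_day = timestamp.split(' ', 1)[1]  # e.g. "08:30:00.457000"
> >>>     # ignore rows outside the 08:30:00-15:00:00 window
> >>>     if time_of_day < '08:30:00' or time_of_day > '15:00:00':
> >>>         continue
> >>>     # determine the row type and apply a minor, stateless rewrite here
> >>>     sys.stdout.write(timestamp + '\t' + record + '\n')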
> >>>
> >>>
> >>> Can anyone provide some insight on how we can eliminate this issue?
> >>> I'm certain this email does not provide enough info; please let me
> >>> know what further information is needed to troubleshoot.
> >>>
> >>> Thanks in advance,
> >>>
> >>> -Kelly
> >>>
> >>
> >>
> >>
> >> --
> >> Regards,
> >> R.V.
> >>
>



-- 
Regards,
R.V.
