hadoop-common-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ted Yu <yuzhih...@gmail.com>
Subject Re: Shuffle In Memory OutOfMemoryError
Date Sat, 08 May 2010 03:25:15 GMT
You need to lower mapred.job.shuffle.input.buffer.percent to 20% or 25%.
I didn't have time recently to find the root cause in 0.20.2

I was told that shuffle has been rewritten in 0.21
You may give it a try.

On Fri, May 7, 2010 at 8:08 PM, Bo Shi <bshi@visiblemeasures.com> wrote:

> Hey Ted, any further insights on this?  We're encountering a similar
> issue (on CD2).  I'll be applying MAPREDUCE-1182 to see if that
> resolves our case but it sounds like that JIRA didn't completely
> eliminate the problem for some folks.
>
> On Wed, Mar 10, 2010 at 11:54 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > I pressed send key a bit early.
> >
> > I will have to dig a bit deeper.
> > Hopefully someone can find reader.close() call after which I will look
> for
> > another possible root cause :-)
> >
> >
> > On Wed, Mar 10, 2010 at 7:48 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> >
> >> Thanks to Andy for the log he provided.
> >>
> >> You can see from the log below that size increased steadily from
> 341535057
> >> to 408181692, approaching maxSize. Then OOME:
> >>
> >>
> >> 2010-03-10 18:38:32,936 INFO org.apache.hadoop.mapred.ReduceTask:
> reserve:
> >> pos=start requestedSize=3893000 size=341535057 numPendingRequests=0
> >> maxSize=417601952
> >> 2010-03-10 18:38:32,936 INFO org.apache.hadoop.mapred.ReduceTask:
> reserve:
> >> pos=end requestedSize=3893000 size=345428057 numPendingRequests=0
> >> maxSize=417601952
> >> ...
> >> 2010-03-10 18:38:35,950 INFO org.apache.hadoop.mapred.ReduceTask:
> reserve:
> >> pos=end requestedSize=635753 size=408181692 numPendingRequests=0
> >> maxSize=417601952
> >> 2010-03-10 18:38:36,603 INFO org.apache.hadoop.mapred.ReduceTask: Task
> >> attempt_201003101826_0001_r_000004_0: Failed fetch #1 from
> >> attempt_201003101826_0001_m_000875_0
> >>
> >> 2010-03-10 18:38:36,603 WARN org.apache.hadoop.mapred.ReduceTask:
> >> attempt_201003101826_0001_r_000004_0 adding host
> hd17.dfs.returnpath.netto penalty box, next contact in 4 seconds
> >> 2010-03-10 18:38:36,604 INFO org.apache.hadoop.mapred.ReduceTask:
> >> attempt_201003101826_0001_r_000004_0: Got 1 map-outputs from previous
> >> failures
> >> 2010-03-10 18:38:36,605 FATAL org.apache.hadoop.mapred.TaskRunner:
> >> attempt_201003101826_0001_r_000004_0 : Map output copy failure :
> >> java.lang.OutOfMemoryError: Java heap space
> >>         at
> >>
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1513)
> >>         at
> >>
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1413)
> >>         at
> >>
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1266)
> >>         at
> >>
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1200)
> >>
> >> Looking at the call to unreserve() in ReduceTask, two were for
> IOException
> >> and the other was for Sanity check (line 1557). Meaning they wouldn't be
> >> called in normal execution path.
> >>
> >> I see one call in IFile.InMemoryReader close() method:
> >>       // Inform the RamManager
> >>       ramManager.unreserve(bufferSize);
> >>
> >> And InMemoryReader is used in createInMemorySegments():
> >>           Reader<K, V> reader =
> >>             new InMemoryReader<K, V>(ramManager, mo.mapAttemptId,
> >>                                      mo.data, 0, mo.data.length);
> >>
> >> But I don't see reader.close() in ReduceTask file.
> >>
> >
> >
> >> On Wed, Mar 10, 2010 at 3:34 PM, Chris Douglas <chrisdo@yahoo-inc.com
> >wrote:
> >>
> >>> I don't think this OOM is a framework bug per se, and given the
> >>> rewrite/refactoring of the shuffle in MAPREDUCE-318 (in 0.21), tuning
> the
> >>> 0.20 shuffle semantics is likely not worthwhile (though data informing
> >>> improvements to trunk would be excellent). Most likely (and
> tautologically),
> >>> ReduceTask simply requires more memory than is available and the job
> failure
> >>> can be avoided by either 0) increasing the heap size or 1) lowering
> >>> mapred.shuffle.input.buffer.percent. Most of the tasks we run have a
> heap of
> >>> 1GB. For a reduce fetching >200k map outputs, that's a reasonable, even
> >>> stingy amount of space. -C
> >>>
> >>>
> >>> On Mar 10, 2010, at 5:26 AM, Ted Yu wrote:
> >>>
> >>>  I verified that size and maxSize are long. This means MR-1182 didn't
> >>>> resolve
> >>>> Andy's issue.
> >>>>
> >>>> According to Andy:
> >>>> At the beginning of the job there are 209,754 pending map tasks and
32
> >>>> pending reduce tasks
> >>>>
> >>>> My guess is that GC wasn't reclaiming memory fast enough, leading to
> OOME
> >>>> because of large number of in-memory shuffle candidates.
> >>>>
> >>>> My suggestion for Andy would be to:
> >>>> 1. add -*verbose*:*gc as JVM parameter
> >>>> 2. modify reserve() slightly to calculate the maximum outstanding
> >>>> numPendingRequests and print the maximum.
> >>>>
> >>>> Based on the output from above two items, we can discuss solution.
> >>>> My intuition is to place upperbound on numPendingRequests beyond which
> >>>> canFitInMemory() returns false.
> >>>> *
> >>>> My two cents.
> >>>>
> >>>> On Tue, Mar 9, 2010 at 11:51 PM, Christopher Douglas
> >>>> <chrisdo@yahoo-inc.com>wrote:
> >>>>
> >>>>  That section of code is unmodified in MR-1182. See the patches/svn
> log.
> >>>>> -C
> >>>>>
> >>>>> Sent from my iPhone
> >>>>>
> >>>>>
> >>>>> On Mar 9, 2010, at 7:44 PM, "Ted Yu" <yuzhihong@gmail.com>
wrote:
> >>>>>
> >>>>> I just downloaded hadoop-0.20.2 tar ball from cloudera mirror.
> >>>>>
> >>>>>> This is what I see in ReduceTask (line 999):
> >>>>>>   public synchronized boolean reserve(int requestedSize, InputStream
> >>>>>> in)
> >>>>>>
> >>>>>>   throws InterruptedException {
> >>>>>>     // Wait till the request can be fulfilled...
> >>>>>>     while ((size + requestedSize) > maxSize) {
> >>>>>>
> >>>>>> I don't see the fix from MR-1182.
> >>>>>>
> >>>>>> That's why I suggested to Andy that he manually apply MR-1182.
> >>>>>>
> >>>>>> Cheers
> >>>>>>
> >>>>>> On Tue, Mar 9, 2010 at 5:01 PM, Andy Sautins <
> >>>>>> andy.sautins@returnpath.net
> >>>>>>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>>  Thanks Christopher.
> >>>>>>>
> >>>>>>> The heap size for reduce tasks is configured to be 640M
(
> >>>>>>> mapred.child.java.opts set to -Xmx640m ).
> >>>>>>>
> >>>>>>> Andy
> >>>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: Christopher Douglas [mailto:chrisdo@yahoo-inc.com]
> >>>>>>> Sent: Tuesday, March 09, 2010 5:19 PM
> >>>>>>> To: common-user@hadoop.apache.org
> >>>>>>> Subject: Re: Shuffle In Memory OutOfMemoryError
> >>>>>>>
> >>>>>>> No, MR-1182 is included in 0.20.2
> >>>>>>>
> >>>>>>> What heap size have you set for your reduce tasks? -C
> >>>>>>>
> >>>>>>> Sent from my iPhone
> >>>>>>>
> >>>>>>> On Mar 9, 2010, at 2:34 PM, "Ted Yu" <yuzhihong@gmail.com>
wrote:
> >>>>>>>
> >>>>>>> Andy:
> >>>>>>>
> >>>>>>>> You need to manually apply the patch.
> >>>>>>>>
> >>>>>>>> Cheers
> >>>>>>>>
> >>>>>>>> On Tue, Mar 9, 2010 at 2:23 PM, Andy Sautins <
> >>>>>>>>
> >>>>>>>>  andy.sautins@returnpath.net
> >>>>>>>
> >>>>>>>  wrote:
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>  Thanks Ted.  My understanding is that MAPREDUCE-1182
is included
> >>>>>>>>> in the
> >>>>>>>>> 0.20.2 release.  We upgraded our cluster to 0.20.2
this weekend
> and
> >>>>>>>>> re-ran
> >>>>>>>>> the same job scenarios.  Running with
> mapred.reduce.parallel.copies
> >>>>>>>>> set to 1
> >>>>>>>>> and continue to have the same Java heap space error.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> -----Original Message-----
> >>>>>>>>> From: Ted Yu [mailto:yuzhihong@gmail.com]
> >>>>>>>>> Sent: Tuesday, March 09, 2010 12:56 PM
> >>>>>>>>> To: common-user@hadoop.apache.org
> >>>>>>>>> Subject: Re: Shuffle In Memory OutOfMemoryError
> >>>>>>>>>
> >>>>>>>>> This issue has been resolved in
> >>>>>>>>> http://issues.apache.org/jira/browse/MAPREDUCE-1182
> >>>>>>>>>
> >>>>>>>>> Please apply the patch
> >>>>>>>>> M1182-1v20.patch<
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> http://issues.apache.org/jira/secure/attachment/12424116/M1182-1v20.patch
> >>>>>>>
> >>>>>>>
> >>>>>>>>
> >>>>>>>>>>  On Sun, Mar 7, 2010 at 3:57 PM, Andy Sautins
<
> >>>>>>>>>
> >>>>>>>>>  andy.sautins@returnpath.net
> >>>>>>>>
> >>>>>>>
> >>>>>>>  wrote:
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>  Thanks Ted.  Very helpful.  You are correct that
I misunderstood
> >>>>>>>>>> the
> >>>>>>>>>>
> >>>>>>>>>>  code
> >>>>>>>>>
> >>>>>>>>>  at ReduceTask.java:1535.  I missed the fact that
it's in a
> >>>>>>>>>> IOException
> >>>>>>>>>>
> >>>>>>>>>>  catch
> >>>>>>>>>
> >>>>>>>>>  block.  My mistake.  That's what I get for being
in a rush.
> >>>>>>>>>>
> >>>>>>>>>> For what it's worth I did re-run the job with
> >>>>>>>>>> mapred.reduce.parallel.copies set with values
from 5 all the way
> >>>>>>>>>> down to
> >>>>>>>>>>
> >>>>>>>>>>  1.
> >>>>>>>>>
> >>>>>>>>>  All failed with the same error:
> >>>>>>>>>>
> >>>>>>>>>> Error: java.lang.OutOfMemoryError: Java heap
space
> >>>>>>>>>>   at
> >>>>>>>>>>
> >>>>>>>>>> org.apache.hadoop.mapred.ReduceTask$ReduceCopier
> >>>>>>>>>>
> >>>>>>>>> $MapOutputCopier.shuffleInMemory(ReduceTask.java:1508)
> >>>>>>>>>
> >>>>>>>>>    at
> >>>>>>>>>>
> >>>>>>>>>> org.apache.hadoop.mapred.ReduceTask$ReduceCopier
> >>>>>>>>>>
> >>>>>>>>> $MapOutputCopier.getMapOutput(ReduceTask.java:1408)
> >>>>>>>>>
> >>>>>>>>>    at
> >>>>>>>>>>
> >>>>>>>>>> org.apache.hadoop.mapred.ReduceTask$ReduceCopier
> >>>>>>>>>>
> >>>>>>>>> $MapOutputCopier.copyOutput(ReduceTask.java:1261)
> >>>>>>>>>
> >>>>>>>>>    at
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run
> >>>>>>>>>>
> >>>>>>>>> (ReduceTask.java:1195)
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> So from that it does seem like something else
might be going on,
> >>>>>>>>>> yes?
> >>>>>>>>>>
> >>>>>>>>>>  I
> >>>>>>>>>
> >>>>>>>>>  need to do some more research.
> >>>>>>>>>>
> >>>>>>>>>> I appreciate your insights.
> >>>>>>>>>>
> >>>>>>>>>> Andy
> >>>>>>>>>>
> >>>>>>>>>> -----Original Message-----
> >>>>>>>>>> From: Ted Yu [mailto:yuzhihong@gmail.com]
> >>>>>>>>>> Sent: Sunday, March 07, 2010 3:38 PM
> >>>>>>>>>> To: common-user@hadoop.apache.org
> >>>>>>>>>> Subject: Re: Shuffle In Memory OutOfMemoryError
> >>>>>>>>>>
> >>>>>>>>>> My observation is based on this call chain:
> >>>>>>>>>> MapOutputCopier.run() calling copyOutput() calling
> getMapOutput()
> >>>>>>>>>> calling
> >>>>>>>>>> ramManager.canFitInMemory(decompressedLength)
> >>>>>>>>>>
> >>>>>>>>>> Basically ramManager.canFitInMemory() makes
decision without
> >>>>>>>>>> considering
> >>>>>>>>>> the
> >>>>>>>>>> number of MapOutputCopiers that are running.
Thus 1.25 * 0.7 of
> >>>>>>>>>> total
> >>>>>>>>>>
> >>>>>>>>>>  heap
> >>>>>>>>>
> >>>>>>>>>  may be used in shuffling if default parameters
were used.
> >>>>>>>>>> Of course, you should check the value for
> >>>>>>>>>> mapred.reduce.parallel.copies
> >>>>>>>>>>
> >>>>>>>>>>  to
> >>>>>>>>>
> >>>>>>>>>  see if it is 5. If it is 4 or lower, my reasoning
wouldn't
> apply.
> >>>>>>>>>>
> >>>>>>>>>> About ramManager.unreserve() call, ReduceTask.java
from hadoop
> >>>>>>>>>> 0.20.2
> >>>>>>>>>>
> >>>>>>>>>>  only
> >>>>>>>>>
> >>>>>>>>>  has 2731 lines. So I have to guess the location
of the code
> >>>>>>>>>> snippet you
> >>>>>>>>>> provided.
> >>>>>>>>>> I found this around line 1535:
> >>>>>>>>>>   } catch (IOException ioe) {
> >>>>>>>>>>     LOG.info("Failed to shuffle from " +
> >>>>>>>>>> mapOutputLoc.getTaskAttemptId(),
> >>>>>>>>>>              ioe);
> >>>>>>>>>>
> >>>>>>>>>>     // Inform the ram-manager
> >>>>>>>>>>     ramManager.closeInMemoryFile(mapOutputLength);
> >>>>>>>>>>     ramManager.unreserve(mapOutputLength);
> >>>>>>>>>>
> >>>>>>>>>>     // Discard the map-output
> >>>>>>>>>>     try {
> >>>>>>>>>>       mapOutput.discard();
> >>>>>>>>>>     } catch (IOException ignored) {
> >>>>>>>>>>       LOG.info("Failed to discard map-output
from " +
> >>>>>>>>>>                mapOutputLoc.getTaskAttemptId(),
ignored);
> >>>>>>>>>>     }
> >>>>>>>>>> Please confirm the line number.
> >>>>>>>>>>
> >>>>>>>>>> If we're looking at the same code, I am afraid
I don't see how
> we
> >>>>>>>>>> can
> >>>>>>>>>> improve it. First, I assume IOException shouldn't
happen that
> >>>>>>>>>> often.
> >>>>>>>>>> Second,
> >>>>>>>>>> mapOutput.discard() just sets:
> >>>>>>>>>>     data = null;
> >>>>>>>>>> for in memory case. Even if we call mapOutput.discard()
before
> >>>>>>>>>> ramManager.unreserve(), we don't know when GC
would kick in and
> >>>>>>>>>> make more
> >>>>>>>>>> memory available.
> >>>>>>>>>> Of course, given the large number of map outputs
in your system,
> it
> >>>>>>>>>>
> >>>>>>>>>>  became
> >>>>>>>>>
> >>>>>>>>>  more likely that the root cause from my reasoning
made OOME
> happen
> >>>>>>>>>>
> >>>>>>>>>>  sooner.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>> Thanks
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>  On Sun, Mar 7, 2010 at 1:03 PM, Andy Sautins
<
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>  andy.sautins@returnpath.net
> >>>>>>>>>
> >>>>>>>>>  wrote:
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>  Ted,
> >>>>>>>>>>>
> >>>>>>>>>>> I'm trying to follow the logic in your mail
and I'm not sure
> I'm
> >>>>>>>>>>> following.  If you would mind helping me
understand I would
> >>>>>>>>>>> appreciate
> >>>>>>>>>>>
> >>>>>>>>>>>  it.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> Looking at the code maxSingleShuffleLimit
is only used in
> >>>>>>>>>>> determining
> >>>>>>>>>>>
> >>>>>>>>>>>  if
> >>>>>>>>>>
> >>>>>>>>>>  the copy _can_ fit into memory:
> >>>>>>>>>>>
> >>>>>>>>>>> boolean canFitInMemory(long requestedSize)
{
> >>>>>>>>>>>   return (requestedSize < Integer.MAX_VALUE
&&
> >>>>>>>>>>>           requestedSize < maxSingleShuffleLimit);
> >>>>>>>>>>>  }
> >>>>>>>>>>>
> >>>>>>>>>>> It also looks like the RamManager.reserve
should wait until
> >>>>>>>>>>> memory
> >>>>>>>>>>>
> >>>>>>>>>>>  is
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>  available so it should hit a memory limit for that
reason.
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> What does seem a little strange to me is
the following (
> >>>>>>>>>>>
> >>>>>>>>>>>  ReduceTask.java
> >>>>>>>>>>
> >>>>>>>>>>  starting at 2730 ):
> >>>>>>>>>>>
> >>>>>>>>>>>     // Inform the ram-manager
> >>>>>>>>>>>     ramManager.closeInMemoryFile(mapOutputLength);
> >>>>>>>>>>>     ramManager.unreserve(mapOutputLength);
> >>>>>>>>>>>
> >>>>>>>>>>>     // Discard the map-output
> >>>>>>>>>>>     try {
> >>>>>>>>>>>       mapOutput.discard();
> >>>>>>>>>>>     } catch (IOException ignored) {
> >>>>>>>>>>>       LOG.info("Failed to discard map-output
from " +
> >>>>>>>>>>>                mapOutputLoc.getTaskAttemptId(),
ignored);
> >>>>>>>>>>>     }
> >>>>>>>>>>>     mapOutput = null;
> >>>>>>>>>>>
> >>>>>>>>>>> So to me that looks like the ramManager
unreserves the memory
> >>>>>>>>>>> before
> >>>>>>>>>>>
> >>>>>>>>>>>  the
> >>>>>>>>>>
> >>>>>>>>>>  mapOutput is discarded.  Shouldn't the mapOutput
be discarded
> >>>>>>>>>>> _before_
> >>>>>>>>>>>
> >>>>>>>>>>>  the
> >>>>>>>>>>
> >>>>>>>>>>  ramManager unreserves the memory?  If the memory
is unreserved
> >>>>>>>>>>> before
> >>>>>>>>>>>
> >>>>>>>>>>>  the
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>  actual underlying data references are removed then
it seems like
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>  another
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>  thread can try to allocate memory ( ReduceTask.java:2730
)
> before
> >>>>>>>>>>
> >>>>>>>>>>> the
> >>>>>>>>>>> previous memory is disposed ( mapOutput.discard()
).
> >>>>>>>>>>>
> >>>>>>>>>>> Not sure that makes sense.  One thing to
note is that the
> >>>>>>>>>>> particular
> >>>>>>>>>>>
> >>>>>>>>>>>  job
> >>>>>>>>>>
> >>>>>>>>>>  that is failing does have a good number ( 200k+
) of map
> >>>>>>>>>>> outputs.  The
> >>>>>>>>>>>
> >>>>>>>>>>>  large
> >>>>>>>>>>
> >>>>>>>>>>  number of small map outputs may be why we are
triggering a
> >>>>>>>>>>> problem.
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks again for your thoughts.
> >>>>>>>>>>>
> >>>>>>>>>>> Andy
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>> From: Jacob R Rideout [mailto:apache@jacobrideout.net]
> >>>>>>>>>>> Sent: Sunday, March 07, 2010 1:21 PM
> >>>>>>>>>>> To: common-user@hadoop.apache.org
> >>>>>>>>>>> Cc: Andy Sautins; Ted Yu
> >>>>>>>>>>> Subject: Re: Shuffle In Memory OutOfMemoryError
> >>>>>>>>>>>
> >>>>>>>>>>> Ted,
> >>>>>>>>>>>
> >>>>>>>>>>> Thank you. I filled MAPREDUCE-1571 to cover
this issue. I might
> >>>>>>>>>>> have
> >>>>>>>>>>> some time to write a patch later this week.
> >>>>>>>>>>>
> >>>>>>>>>>> Jacob Rideout
> >>>>>>>>>>>
> >>>>>>>>>>> On Sat, Mar 6, 2010 at 11:37 PM, Ted Yu
<yuzhihong@gmail.com>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>  I think there is mismatch (in ReduceTask.java)
between:
> >>>>>>>>>>>>  this.numCopiers =
> conf.getInt("mapred.reduce.parallel.copies",
> >>>>>>>>>>>>
> >>>>>>>>>>>>  5);
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>  and:
> >>>>>>>>>>
> >>>>>>>>>>>   maxSingleShuffleLimit = (long)(maxSize
*
> >>>>>>>>>>>> MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION);
> >>>>>>>>>>>> where MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION
is 0.25f
> >>>>>>>>>>>>
> >>>>>>>>>>>> because
> >>>>>>>>>>>>  copiers = new ArrayList<MapOutputCopier>(numCopiers);
> >>>>>>>>>>>> so the total memory allocated for in-mem
shuffle is 1.25 *
> >>>>>>>>>>>> maxSize
> >>>>>>>>>>>>
> >>>>>>>>>>>> A JIRA should be filed to correlate
the constant 5 above and
> >>>>>>>>>>>> MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Cheers
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Sat, Mar 6, 2010 at 8:31 AM, Jacob
R Rideout <
> >>>>>>>>>>>>
> >>>>>>>>>>>>  apache@jacobrideout.net
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>  wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Hi all,
> >>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> We are seeing the following error
in our reducers of a
> >>>>>>>>>>>>> particular
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>  job:
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>  Error: java.lang.OutOfMemoryError: Java heap
space
> >>>>>>>>>>>>>   at
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>  org.apache.hadoop.mapred.ReduceTask$ReduceCopier
> >>>>>>>>>>
> >>>>>>>>> $MapOutputCopier.shuffleInMemory(ReduceTask.java:1508)
> >>>>>>>>>
> >>>>>>>>>    at
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>  org.apache.hadoop.mapred.ReduceTask$ReduceCopier
> >>>>>>>>>>
> >>>>>>>>> $MapOutputCopier.getMapOutput(ReduceTask.java:1408)
> >>>>>>>>>
> >>>>>>>>>    at
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>  org.apache.hadoop.mapred.ReduceTask$ReduceCopier
> >>>>>>>>>>
> >>>>>>>>> $MapOutputCopier.copyOutput(ReduceTask.java:1261)
> >>>>>>>>>
> >>>>>>>>>    at
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run
> >>>>>>>>>>
> >>>>>>>>> (ReduceTask.java:1195)
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>>>> After enough reducers fail the entire
job fails. This error
> >>>>>>>>>>>>> occurs
> >>>>>>>>>>>>> regardless of whether mapred.compress.map.output
is true. We
> >>>>>>>>>>>>> were
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>  able
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>  to avoid the issue by reducing
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>>>  mapred.job.shuffle.input.buffer.percent
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>  to 20%. Shouldn't the framework via
> >>>>>>>>>>
> >>>>>>>>>>>  ShuffleRamManager.canFitInMemory
> >>>>>>>>>>>>> and.ShuffleRamManager.reserve correctly
detect the the memory
> >>>>>>>>>>>>> available for allocation? I would
think that with poor
> >>>>>>>>>>>>> configuration
> >>>>>>>>>>>>> settings (and default settings in
particular) the job may not
> >>>>>>>>>>>>> be as
> >>>>>>>>>>>>> efficient, but wouldn't die.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Here is some more context in the
logs, I have attached the
> full
> >>>>>>>>>>>>> reducer log here: http://gist.github.com/323746
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> 2010-03-06 07:54:49,621 INFO
> >>>>>>>>>>>>> org.apache.hadoop.mapred.ReduceTask:
> >>>>>>>>>>>>> Shuffling 4191933 bytes (435311
raw bytes) into RAM from
> >>>>>>>>>>>>> attempt_201003060739_0002_m_000061_0
> >>>>>>>>>>>>> 2010-03-06 07:54:50,222 INFO
> >>>>>>>>>>>>> org.apache.hadoop.mapred.ReduceTask:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>  Task
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>  attempt_201003060739_0002_r_000000_0: Failed fetch
#1 from
> >>>>>>>>>>
> >>>>>>>>>>>  attempt_201003060739_0002_m_000202_0
> >>>>>>>>>>>>> 2010-03-06 07:54:50,223 WARN
> >>>>>>>>>>>>> org.apache.hadoop.mapred.ReduceTask:
> >>>>>>>>>>>>> attempt_201003060739_0002_r_000000_0
adding host
> >>>>>>>>>>>>> hd37.dfs.returnpath.net to penalty
box, next contact in 4
> >>>>>>>>>>>>> seconds
> >>>>>>>>>>>>> 2010-03-06 07:54:50,223 INFO
> >>>>>>>>>>>>> org.apache.hadoop.mapred.ReduceTask:
> >>>>>>>>>>>>> attempt_201003060739_0002_r_000000_0:
Got 1 map-outputs from
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>  previous
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>  failures
> >>>>>>>>>>
> >>>>>>>>>>>  2010-03-06 07:54:50,223 FATAL
> >>>>>>>>>>>>> org.apache.hadoop.mapred.TaskRunner:
> >>>>>>>>>>>>> attempt_201003060739_0002_r_000000_0
: Map output copy
> failure :
> >>>>>>>>>>>>> java.lang.OutOfMemoryError: Java
heap space
> >>>>>>>>>>>>>   at
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>  org.apache.hadoop.mapred.ReduceTask$ReduceCopier
> >>>>>>>>>>
> >>>>>>>>> $MapOutputCopier.shuffleInMemory(ReduceTask.java:1508)
> >>>>>>>>>
> >>>>>>>>>    at
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>  org.apache.hadoop.mapred.ReduceTask$ReduceCopier
> >>>>>>>>>>
> >>>>>>>>> $MapOutputCopier.getMapOutput(ReduceTask.java:1408)
> >>>>>>>>>
> >>>>>>>>>    at
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>  org.apache.hadoop.mapred.ReduceTask$ReduceCopier
> >>>>>>>>>>
> >>>>>>>>> $MapOutputCopier.copyOutput(ReduceTask.java:1261)
> >>>>>>>>>
> >>>>>>>>>    at
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run
> >>>>>>>>>>
> >>>>>>>>> (ReduceTask.java:1195)
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>>>> We tried this both in 0.20.1 and
0.20.2. We had hoped
> >>>>>>>>>>>>> MAPREDUCE-1182
> >>>>>>>>>>>>> would address the issue in 0.20.2,
but it did not. Does
> anyone
> >>>>>>>>>>>>> have
> >>>>>>>>>>>>> any comments or suggestions? Is
this a bug I should file a
> JIRA
> >>>>>>>>>>>>> for?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Jacob Rideout
> >>>>>>>>>>>>> Return Path
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>
> >>
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message