hadoop-common-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Shuffle In Memory OutOfMemoryError
Date Thu, 11 Mar 2010 03:54:00 GMT
I pressed the send key a bit early.

I will have to dig a bit deeper.
Hopefully someone can find a reader.close() call, after which I will look for
another possible root cause :-)


On Wed, Mar 10, 2010 at 7:48 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> Thanks to Andy for the log he provided.
>
> You can see from the log below that size increased steadily from 341535057
> to 408181692, approaching maxSize. Then OOME:
>
>
> 2010-03-10 18:38:32,936 INFO org.apache.hadoop.mapred.ReduceTask: reserve:
> pos=start requestedSize=3893000 size=341535057 numPendingRequests=0
> maxSize=417601952
> 2010-03-10 18:38:32,936 INFO org.apache.hadoop.mapred.ReduceTask: reserve:
> pos=end requestedSize=3893000 size=345428057 numPendingRequests=0
> maxSize=417601952
> ...
> 2010-03-10 18:38:35,950 INFO org.apache.hadoop.mapred.ReduceTask: reserve:
> pos=end requestedSize=635753 size=408181692 numPendingRequests=0
> maxSize=417601952
> 2010-03-10 18:38:36,603 INFO org.apache.hadoop.mapred.ReduceTask: Task
> attempt_201003101826_0001_r_000004_0: Failed fetch #1 from
> attempt_201003101826_0001_m_000875_0
>
> 2010-03-10 18:38:36,603 WARN org.apache.hadoop.mapred.ReduceTask:
> attempt_201003101826_0001_r_000004_0 adding host hd17.dfs.returnpath.net to
> penalty box, next contact in 4 seconds
> 2010-03-10 18:38:36,604 INFO org.apache.hadoop.mapred.ReduceTask:
> attempt_201003101826_0001_r_000004_0: Got 1 map-outputs from previous
> failures
> 2010-03-10 18:38:36,605 FATAL org.apache.hadoop.mapred.TaskRunner:
> attempt_201003101826_0001_r_000004_0 : Map output copy failure :
> java.lang.OutOfMemoryError: Java heap space
>         at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1513)
>         at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1413)
>         at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1266)
>         at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1200)
>
> Looking at the calls to unreserve() in ReduceTask, two were for IOException
> and the other was for a sanity check (line 1557), meaning they wouldn't be
> called in the normal execution path.
>
> I see one call in the IFile.InMemoryReader close() method:
>       // Inform the RamManager
>       ramManager.unreserve(bufferSize);
>
> And InMemoryReader is used in createInMemorySegments():
>           Reader<K, V> reader =
>             new InMemoryReader<K, V>(ramManager, mo.mapAttemptId,
>                                      mo.data, 0, mo.data.length);
>
> But I don't see a reader.close() call anywhere in the ReduceTask file.
>
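To make the point concrete, here is a minimal sketch (illustration only, not the actual 0.20 code path) of the kind of cleanup that would trigger that unreserve(), using the names from the snippets above:

      Reader<K, V> reader =
        new InMemoryReader<K, V>(ramManager, mo.mapAttemptId,
                                 mo.data, 0, mo.data.length);
      try {
        // ... hand the reader to the in-memory merge ...
      } finally {
        reader.close();   // InMemoryReader.close() calls ramManager.unreserve(bufferSize)
      }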


> On Wed, Mar 10, 2010 at 3:34 PM, Chris Douglas <chrisdo@yahoo-inc.com> wrote:
>
>> I don't think this OOM is a framework bug per se, and given the
>> rewrite/refactoring of the shuffle in MAPREDUCE-318 (in 0.21), tuning the
>> 0.20 shuffle semantics is likely not worthwhile (though data informing
>> improvements to trunk would be excellent). Most likely (and tautologically),
>> ReduceTask simply requires more memory than is available and the job failure
>> can be avoided by either 0) increasing the heap size or 1) lowering
>> mapred.job.shuffle.input.buffer.percent. Most of the tasks we run have a heap of
>> 1GB. For a reduce fetching >200k map outputs, that's a reasonable, even
>> stingy amount of space. -C
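As a rough illustration of the two knobs Chris mentions (the values below are examples, not recommendations), a job could be configured along these lines with the 0.20 JobConf API; the same properties can also be passed per job with -D on the command line:

      // Example only: raise the reduce-side heap and/or shrink the shuffle buffer.
      org.apache.hadoop.mapred.JobConf conf = new org.apache.hadoop.mapred.JobConf();
      conf.set("mapred.child.java.opts", "-Xmx1024m");                 // bigger task heap
      conf.setFloat("mapred.job.shuffle.input.buffer.percent", 0.5f);  // smaller shuffle buffer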
>>
>>
>> On Mar 10, 2010, at 5:26 AM, Ted Yu wrote:
>>
>>> I verified that size and maxSize are long. This means MR-1182 didn't
>>> resolve Andy's issue.
>>>
>>> According to Andy:
>>> At the beginning of the job there are 209,754 pending map tasks and 32
>>> pending reduce tasks
>>>
>>> My guess is that GC wasn't reclaiming memory fast enough, leading to OOME
>>> because of the large number of in-memory shuffle candidates.
>>>
>>> My suggestion for Andy would be to:
>>> 1. add -verbose:gc as a JVM parameter
>>> 2. modify reserve() slightly to calculate the maximum outstanding
>>> numPendingRequests and print the maximum.
>>>
>>> Based on the output from the above two items, we can discuss a solution.
>>> My intuition is to place an upper bound on numPendingRequests beyond which
>>> canFitInMemory() returns false (see the sketch below).
>>>
>>> My two cents.
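A rough sketch of that intuition (maxPendingRequests is a hypothetical new field, not something in the shipped code):

      // Sketch only: refuse an in-memory shuffle when too many reservations are
      // already outstanding, so the copy falls back to disk instead of piling up.
      boolean canFitInMemory(long requestedSize) {
        return (requestedSize < Integer.MAX_VALUE &&
                requestedSize < maxSingleShuffleLimit &&
                numPendingRequests < maxPendingRequests);  // proposed upper bound
      }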
>>>
>>> On Tue, Mar 9, 2010 at 11:51 PM, Christopher Douglas
>>> <chrisdo@yahoo-inc.com> wrote:
>>>
>>>> That section of code is unmodified in MR-1182. See the patches/svn log.
>>>> -C
>>>>
>>>> Sent from my iPhone
>>>>
>>>>
>>>> On Mar 9, 2010, at 7:44 PM, "Ted Yu" <yuzhihong@gmail.com> wrote:
>>>>
>>>>> I just downloaded the hadoop-0.20.2 tarball from the Cloudera mirror.
>>>>>
>>>>> This is what I see in ReduceTask (line 999):
>>>>>   public synchronized boolean reserve(int requestedSize, InputStream in)
>>>>>       throws InterruptedException {
>>>>>     // Wait till the request can be fulfilled...
>>>>>     while ((size + requestedSize) > maxSize) {
>>>>>
>>>>> I don't see the fix from MR-1182.
>>>>>
>>>>> That's why I suggested to Andy that he manually apply MR-1182.
>>>>>
>>>>> Cheers
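As a side note for readers following the thread, here is a toy sketch (not the Hadoop code) of the bounded-reservation pattern that reserve() loop implements; each copier blocks in reserve() until unreserve() frees enough bytes:

      // Toy illustration only; field and method names mirror the snippet above.
      class ToyRamManager {
        private final long maxSize;
        private long size = 0;

        ToyRamManager(long maxSize) { this.maxSize = maxSize; }

        synchronized void reserve(long requestedSize) throws InterruptedException {
          while (size + requestedSize > maxSize) {
            wait();                 // block until another thread unreserves
          }
          size += requestedSize;
        }

        synchronized void unreserve(long requestedSize) {
          size -= requestedSize;
          notifyAll();              // wake any copier waiting in reserve()
        }
      }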
>>>>>
>>>>> On Tue, Mar 9, 2010 at 5:01 PM, Andy Sautins <andy.sautins@returnpath.net>
>>>>> wrote:
>>>>>>
>>>>>> Thanks Christopher.
>>>>>>
>>>>>> The heap size for reduce tasks is configured to be 640M
>>>>>> (mapred.child.java.opts set to -Xmx640m).
>>>>>>
>>>>>> Andy
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Christopher Douglas [mailto:chrisdo@yahoo-inc.com]
>>>>>> Sent: Tuesday, March 09, 2010 5:19 PM
>>>>>> To: common-user@hadoop.apache.org
>>>>>> Subject: Re: Shuffle In Memory OutOfMemoryError
>>>>>>
>>>>>> No, MR-1182 is included in 0.20.2
>>>>>>
>>>>>> What heap size have you set for your reduce tasks? -C
>>>>>>
>>>>>> Sent from my iPhone
>>>>>>
>>>>>> On Mar 9, 2010, at 2:34 PM, "Ted Yu" <yuzhihong@gmail.com> wrote:
>>>>>>
>>>>>>> Andy:
>>>>>>>
>>>>>>> You need to manually apply the patch.
>>>>>>>
>>>>>>> Cheers
>>>>>>>
>>>>>>> On Tue, Mar 9, 2010 at 2:23 PM, Andy Sautins <andy.sautins@returnpath.net>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thanks Ted.  My understanding is that MAPREDUCE-1182 is included in the
>>>>>>>> 0.20.2 release.  We upgraded our cluster to 0.20.2 this weekend and re-ran
>>>>>>>> the same job scenarios.  Running with mapred.reduce.parallel.copies
>>>>>>>> set to 1, we continue to have the same Java heap space error.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Ted Yu [mailto:yuzhihong@gmail.com]
>>>>>>>> Sent: Tuesday, March 09, 2010 12:56 PM
>>>>>>>> To: common-user@hadoop.apache.org
>>>>>>>> Subject: Re: Shuffle In Memory OutOfMemoryError
>>>>>>>>
>>>>>>>> This issue has been resolved in
>>>>>>>> http://issues.apache.org/jira/browse/MAPREDUCE-1182
>>>>>>>>
>>>>>>>> Please apply the patch M1182-1v20.patch
>>>>>>>> <http://issues.apache.org/jira/secure/attachment/12424116/M1182-1v20.patch>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>> On Sun, Mar 7, 2010 at 3:57 PM, Andy Sautins <andy.sautins@returnpath.net>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Thanks Ted.  Very helpful.  You are correct that I misunderstood the code
>>>>>>>>> at ReduceTask.java:1535.  I missed the fact that it's in an IOException
>>>>>>>>> catch block.  My mistake.  That's what I get for being in a rush.
>>>>>>>>>
>>>>>>>>> For what it's worth I did re-run the job with mapred.reduce.parallel.copies
>>>>>>>>> set with values from 5 all the way down to 1.  All failed with the same
>>>>>>>>> error:
>>>>>>>>>
>>>>>>>>> Error: java.lang.OutOfMemoryError: Java heap space
>>>>>>>>>   at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1508)
>>>>>>>>>   at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1408)
>>>>>>>>>   at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
>>>>>>>>>   at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> So from that it does seem like something else might be going on, yes?
>>>>>>>>> I need to do some more research.
>>>>>>>>>
>>>>>>>>> I appreciate your insights.
>>>>>>>>>
>>>>>>>>> Andy
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Ted Yu [mailto:yuzhihong@gmail.com]
>>>>>>>>> Sent: Sunday, March 07, 2010 3:38 PM
>>>>>>>>> To: common-user@hadoop.apache.org
>>>>>>>>> Subject: Re: Shuffle In Memory OutOfMemoryError
>>>>>>>>>
>>>>>>>>> My observation is based on this call chain:
>>>>>>>>> MapOutputCopier.run() calling copyOutput() calling getMapOutput() calling
>>>>>>>>> ramManager.canFitInMemory(decompressedLength)
>>>>>>>>>
>>>>>>>>> Basically ramManager.canFitInMemory() makes the decision without considering
>>>>>>>>> the number of MapOutputCopiers that are running. Thus 1.25 * 0.7 of the total
>>>>>>>>> heap may be used in shuffling if default parameters were used.
>>>>>>>>> Of course, you should check the value for mapred.reduce.parallel.copies to
>>>>>>>>> see if it is 5. If it is 4 or lower, my reasoning wouldn't apply.
>>>>>>>>>
>>>>>>>>> About the ramManager.unreserve() call, ReduceTask.java from hadoop 0.20.2
>>>>>>>>> only has 2731 lines. So I have to guess the location of the code snippet
>>>>>>>>> you provided.
>>>>>>>>> I found this around line 1535:
>>>>>>>>>   } catch (IOException ioe) {
>>>>>>>>>     LOG.info("Failed to shuffle from " + mapOutputLoc.getTaskAttemptId(),
>>>>>>>>>              ioe);
>>>>>>>>>
>>>>>>>>>     // Inform the ram-manager
>>>>>>>>>     ramManager.closeInMemoryFile(mapOutputLength);
>>>>>>>>>     ramManager.unreserve(mapOutputLength);
>>>>>>>>>
>>>>>>>>>     // Discard the map-output
>>>>>>>>>     try {
>>>>>>>>>       mapOutput.discard();
>>>>>>>>>     } catch (IOException ignored) {
>>>>>>>>>       LOG.info("Failed to discard map-output from " +
>>>>>>>>>                mapOutputLoc.getTaskAttemptId(), ignored);
>>>>>>>>>     }
>>>>>>>>> Please confirm the line number.
>>>>>>>>>
>>>>>>>>> If we're looking at the same code, I am afraid I don't see how we can
>>>>>>>>> improve it. First, I assume IOException shouldn't happen that often.
>>>>>>>>> Second, mapOutput.discard() just sets:
>>>>>>>>>     data = null;
>>>>>>>>> for the in-memory case. Even if we call mapOutput.discard() before
>>>>>>>>> ramManager.unreserve(), we don't know when GC would kick in and make more
>>>>>>>>> memory available.
>>>>>>>>> Of course, given the large number of map outputs in your system, it became
>>>>>>>>> more likely that the root cause from my reasoning made the OOME happen
>>>>>>>>> sooner.
>>>>>>>>
>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sun, Mar 7, 2010 at 1:03 PM, Andy Sautins <andy.sautins@returnpath.net>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Ted,
>>>>>>>>>>
>>>>>>>>>> I'm trying to follow the logic in your mail and I'm not sure I'm
>>>>>>>>>> following.  If you wouldn't mind helping me understand I would
>>>>>>>>>> appreciate it.
>>>>>>>>>>
>>>>>>>>>> Looking at the code, maxSingleShuffleLimit is only used in determining if
>>>>>>>>>> the copy _can_ fit into memory:
>>>>>>>>>>
>>>>>>>>>> boolean canFitInMemory(long requestedSize) {
>>>>>>>>>>   return (requestedSize < Integer.MAX_VALUE &&
>>>>>>>>>>           requestedSize < maxSingleShuffleLimit);
>>>>>>>>>>  }
>>>>>>>>>>
>>>>>>>>>> It also looks like RamManager.reserve should wait until memory is
>>>>>>>>>> available, so it shouldn't hit a memory limit for that reason.
>>>>>>>>>>
>>>>>>>>>> What does seem a little strange to me is the following
>>>>>>>>>> (ReduceTask.java starting at 2730):
>>>>>>>>>>
>>>>>>>>>>     // Inform the ram-manager
>>>>>>>>>>     ramManager.closeInMemoryFile(mapOutputLength);
>>>>>>>>>>     ramManager.unreserve(mapOutputLength);
>>>>>>>>>>
>>>>>>>>>>     // Discard the map-output
>>>>>>>>>>     try {
>>>>>>>>>>       mapOutput.discard();
>>>>>>>>>>     } catch (IOException ignored) {
>>>>>>>>>>       LOG.info("Failed to discard map-output from " +
>>>>>>>>>>                mapOutputLoc.getTaskAttemptId(), ignored);
>>>>>>>>>>     }
>>>>>>>>>>     mapOutput = null;
>>>>>>>>>>
>>>>>>>>>> So to me that looks like the ramManager unreserves the memory before the
>>>>>>>>>> mapOutput is discarded.  Shouldn't the mapOutput be discarded _before_ the
>>>>>>>>>> ramManager unreserves the memory?  If the memory is unreserved before the
>>>>>>>>>> actual underlying data references are removed then it seems like another
>>>>>>>>>> thread can try to allocate memory (ReduceTask.java:2730) before the
>>>>>>>>>> previous memory is disposed (mapOutput.discard()).
>>>>>>>>>>
>>>>>>>>>> Not sure that makes sense.  One thing to note is that the particular job
>>>>>>>>>> that is failing does have a good number (200k+) of map outputs.  The large
>>>>>>>>>> number of small map outputs may be why we are triggering a problem.
>>>>>>>>>>
>>>>>>>>>> Thanks again for your thoughts.
>>>>>>>>>>
>>>>>>>>>> Andy
>>>>>>>>>>
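For illustration, the reordering Andy is describing would look roughly like this (a sketch only, rearranging the snippet he quoted; whether it actually helps depends on when GC reclaims the discarded buffer):

      // Sketch: discard the in-memory output first...
      try {
        mapOutput.discard();
      } catch (IOException ignored) {
        LOG.info("Failed to discard map-output from " +
                 mapOutputLoc.getTaskAttemptId(), ignored);
      }
      mapOutput = null;

      // ...and only then tell the ram-manager those bytes are free again.
      ramManager.closeInMemoryFile(mapOutputLength);
      ramManager.unreserve(mapOutputLength);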
>>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Jacob R Rideout [mailto:apache@jacobrideout.net]
>>>>>>>>>> Sent: Sunday, March 07, 2010 1:21 PM
>>>>>>>>>> To: common-user@hadoop.apache.org
>>>>>>>>>> Cc: Andy Sautins; Ted Yu
>>>>>>>>>> Subject: Re: Shuffle In Memory OutOfMemoryError
>>>>>>>>>>
>>>>>>>>>> Ted,
>>>>>>>>>>
>>>>>>>>>> Thank you. I filed MAPREDUCE-1571 to cover this issue. I might have
>>>>>>>>>> some time to write a patch later this week.
>>>>>>>>>>
>>>>>>>>>> Jacob Rideout
>>>>>>>>>>
>>>>>>>>>> On Sat, Mar 6, 2010 at 11:37 PM, Ted Yu <yuzhihong@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> I think there is a mismatch (in ReduceTask.java) between:
>>>>>>>>>>>   this.numCopiers = conf.getInt("mapred.reduce.parallel.copies", 5);
>>>>>>>>>>> and:
>>>>>>>>>>>   maxSingleShuffleLimit = (long)(maxSize * MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION);
>>>>>>>>>>> where MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION is 0.25f
>>>>>>>>>>>
>>>>>>>>>>> because
>>>>>>>>>>>   copiers = new ArrayList<MapOutputCopier>(numCopiers);
>>>>>>>>>>> so the total memory allocated for the in-mem shuffle is 1.25 * maxSize.
>>>>>>>>>>>
>>>>>>>>>>> A JIRA should be filed to correlate the constant 5 above and
>>>>>>>>>>> MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION.
>>>>>>>>>>>
>>>>>>>>>>> Cheers
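To put rough numbers on that worst case (illustration only, combining the maxSize value from the Mar 10 log quoted near the top of this thread, Andy's 640 MB reduce heap, and the defaults named above):

      // Illustrative arithmetic, not measured values.
      long maxSize = 417601952L;            // maxSize from Andy's reducer log (~398 MB)
      int numCopiers = 5;                   // mapred.reduce.parallel.copies default
      double segmentFraction = 0.25;        // MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION
      double worstCase = numCopiers * segmentFraction * maxSize;   // ~1.25 * maxSize
      System.out.printf("~%.0f MB potentially buffered vs a 640 MB reduce heap%n",
                        worstCase / (1 << 20));                    // roughly 498 MB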
>>>>>>>>>>>
>>>>>>>>>>> On Sat, Mar 6, 2010 at 8:31 AM, Jacob R Rideout <apache@jacobrideout.net>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> We are seeing the following error in our reducers of a particular job:
>>>>>>>>>>>>
>>>>>>>>
>>>>>>>>>>>> Error: java.lang.OutOfMemoryError: Java heap space
>>>>>>>>>>>>   at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1508)
>>>>>>>>>>>>   at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1408)
>>>>>>>>>>>>   at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
>>>>>>>>>>>>   at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>> After enough reducers fail the entire job fails. This error occurs
>>>>>>>>>>>> regardless of whether mapred.compress.map.output is true. We were able
>>>>>>>>>>>> to avoid the issue by reducing mapred.job.shuffle.input.buffer.percent
>>>>>>>>>>>> to 20%. Shouldn't the framework, via ShuffleRamManager.canFitInMemory
>>>>>>>>>>>> and ShuffleRamManager.reserve, correctly detect the memory available
>>>>>>>>>>>> for allocation? I would think that with poor configuration settings
>>>>>>>>>>>> (and default settings in particular) the job may not be as efficient,
>>>>>>>>>>>> but wouldn't die.
>>>>>>>>>>>>
>>>>>>>>>>>> Here is some more context from the logs; I have attached the full
>>>>>>>>>>>> reducer log here: http://gist.github.com/323746
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 2010-03-06 07:54:49,621 INFO org.apache.hadoop.mapred.ReduceTask:
>>>>>>>>>>>> Shuffling 4191933 bytes (435311 raw bytes) into RAM from
>>>>>>>>>>>> attempt_201003060739_0002_m_000061_0
>>>>>>>>>>>> 2010-03-06 07:54:50,222 INFO org.apache.hadoop.mapred.ReduceTask: Task
>>>>>>>>>>>> attempt_201003060739_0002_r_000000_0: Failed fetch #1 from
>>>>>>>>>>>> attempt_201003060739_0002_m_000202_0
>>>>>>>>>>>> 2010-03-06 07:54:50,223 WARN org.apache.hadoop.mapred.ReduceTask:
>>>>>>>>>>>> attempt_201003060739_0002_r_000000_0 adding host
>>>>>>>>>>>> hd37.dfs.returnpath.net to penalty box, next contact in 4 seconds
>>>>>>>>>>>> 2010-03-06 07:54:50,223 INFO org.apache.hadoop.mapred.ReduceTask:
>>>>>>>>>>>> attempt_201003060739_0002_r_000000_0: Got 1 map-outputs from previous
>>>>>>>>>>>> failures
>>>>>>>>>>>> 2010-03-06 07:54:50,223 FATAL org.apache.hadoop.mapred.TaskRunner:
>>>>>>>>>>>> attempt_201003060739_0002_r_000000_0 : Map output copy failure :
>>>>>>>>>>>> java.lang.OutOfMemoryError: Java heap space
>>>>>>>>>>>>   at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1508)
>>>>>>>>>>>>   at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1408)
>>>>>>>>>>>>   at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
>>>>>>>>>>>>   at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>> We tried this both in 0.20.1 and 0.20.2. We had hoped MAPREDUCE-1182
>>>>>>>>>>>> would address the issue in 0.20.2, but it did not. Does anyone have
>>>>>>>>>>>> any comments or suggestions? Is this a bug I should file a JIRA for?
>>>>>>>>>>>>
>>>>>>>>>>>> Jacob Rideout
>>>>>>>>>>>> Return Path
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>
>
