hadoop-mapreduce-issues mailing list archives

From "Ted Yu (JIRA)" <j...@apache.org>
Subject [jira] Created: (MAPREDUCE-2177) The wait for spill completion should call Condition.awaitNanos(long nanosTimeout)
Date Sun, 07 Nov 2010 00:58:24 GMT
The wait for spill completion should call Condition.awaitNanos(long nanosTimeout)
---------------------------------------------------------------------------------

                 Key: MAPREDUCE-2177
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2177
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: tasktracker
    Affects Versions: 0.20.2
            Reporter: Ted Yu


We sometimes saw map tasks time out in cdh3b2. Here is the log from one of them:

2010-11-04 10:34:23,820 INFO org.apache.hadoop.mapred.MapTask: Spilling map output: buffer full= true
2010-11-04 10:34:23,820 INFO org.apache.hadoop.mapred.MapTask: bufstart = 119534169; bufend = 59763857; bufvoid = 298844160
2010-11-04 10:34:23,820 INFO org.apache.hadoop.mapred.MapTask: kvstart = 438913; kvend = 585320; length = 983040
2010-11-04 10:34:41,615 INFO org.apache.hadoop.mapred.MapTask: Finished spill 3
2010-11-04 10:35:45,352 INFO org.apache.hadoop.mapred.MapTask: Spilling map output: buffer full= true
2010-11-04 10:35:45,547 INFO org.apache.hadoop.mapred.MapTask: bufstart = 59763857; bufend = 298837899; bufvoid = 298844160
2010-11-04 10:35:45,547 INFO org.apache.hadoop.mapred.MapTask: kvstart = 585320; kvend = 731585; length = 983040
2010-11-04 10:45:41,289 INFO org.apache.hadoop.mapred.MapTask: Finished spill 4

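Note how long the last spill took: spill 4 ran from 10:35:45 to 10:45:41, just under 10 minutes, while spill 3 finished in about 18 seconds.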

In MapTask.java, the following code waits for the spill to finish:

    while (kvstart != kvend) {
      reporter.progress();
      spillDone.await();
    }

The code in trunk is similar.

Condition.await() has no timeout. If the SpillThread takes a long time before calling
spillDone.signal(), the map task stays blocked in await() without reporting progress again,
and we see the task time out.
Condition.awaitNanos(long nanosTimeout) should be called instead.
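
To illustrate the proposal, here is a minimal standalone sketch of a bounded wait. It is not
the MapTask code and not a patch: the class SpillWaitSketch, the methods progress(),
startSpill() and waitForSpill(), the counter progressCalls, and the interval values are all
hypothetical stand-ins. Only kvstart, kvend, spillDone and the SpillThread idea come from the
description above. The point is that awaitNanos() returns after at most the given interval,
so the loop can report progress again while the spill is still running.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the spill wait in MapTask; not the actual Hadoop code.
class SpillWaitSketch {
    private final ReentrantLock spillLock = new ReentrantLock();
    private final Condition spillDone = spillLock.newCondition();
    private int kvstart = 0;
    private int kvend = 1; // kvstart != kvend means a spill is still in flight

    // Counts calls to progress(); stands in for reporter.progress().
    final AtomicInteger progressCalls = new AtomicInteger();

    private void progress() {
        progressCalls.incrementAndGet();
    }

    // Simulated SpillThread: works for spillMillis, then marks the spill done and signals.
    Thread startSpill(long spillMillis) {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(spillMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            spillLock.lock();
            try {
                kvstart = kvend;
                spillDone.signal();
            } finally {
                spillLock.unlock();
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }

    // Bounded wait: wakes at least every intervalMillis to report progress again.
    void waitForSpill(long intervalMillis) throws InterruptedException {
        spillLock.lock();
        try {
            while (kvstart != kvend) {
                progress();
                // Returns on signal, on timeout, or spuriously; the loop re-checks the state.
                spillDone.awaitNanos(TimeUnit.MILLISECONDS.toNanos(intervalMillis));
            }
        } finally {
            spillLock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SpillWaitSketch sketch = new SpillWaitSketch();
        sketch.startSpill(500);   // assumed spill duration, for illustration only
        sketch.waitForSpill(100); // assumed reporting interval, for illustration only
        System.out.println("progress calls: " + sketch.progressCalls.get());
    }
}
```

With a plain await() the same loop would report progress once and then block until the
signal arrives. The interval passed to awaitNanos() would need to be well below the task
timeout; what value is appropriate for MapTask is left open here.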

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

