hadoop-mapreduce-issues mailing list archives

From "Chris Douglas (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-2177) The wait for spill completion should call Condition.awaitNanos(long nanosTimeout)
Date Mon, 08 Nov 2010 09:04:12 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12929512#action_12929512 ]

Chris Douglas commented on MAPREDUCE-2177:
------------------------------------------

{quote}SpillThread doesn't currently have reference to TaskReporter.
It is easier to use short timeout for spillDone.awaitNanos() so that Buffer.write() can call
progress().{quote}

That prevents the task from being killed, but its semantics are incorrect. Todd's suggestion,
calling progress() during the merge, at least ensures that the task is doing work; reporting
progress from a thread that isn't actually making progress is broken. Isn't progress already
reported during the merge? Can you provide more detail on the environment where you're observing this?

> The wait for spill completion should call Condition.awaitNanos(long nanosTimeout)
> ---------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2177
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2177
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>    Affects Versions: 0.20.2
>            Reporter: Ted Yu
>
> We sometimes saw map task timeouts in cdh3b2. Here is the log from one of the map tasks:
> 2010-11-04 10:34:23,820 INFO org.apache.hadoop.mapred.MapTask: Spilling map output: buffer full= true
> 2010-11-04 10:34:23,820 INFO org.apache.hadoop.mapred.MapTask: bufstart = 119534169; bufend = 59763857; bufvoid = 298844160
> 2010-11-04 10:34:23,820 INFO org.apache.hadoop.mapred.MapTask: kvstart = 438913; kvend = 585320; length = 983040
> 2010-11-04 10:34:41,615 INFO org.apache.hadoop.mapred.MapTask: Finished spill 3
> 2010-11-04 10:35:45,352 INFO org.apache.hadoop.mapred.MapTask: Spilling map output: buffer full= true
> 2010-11-04 10:35:45,547 INFO org.apache.hadoop.mapred.MapTask: bufstart = 59763857; bufend = 298837899; bufvoid = 298844160
> 2010-11-04 10:35:45,547 INFO org.apache.hadoop.mapred.MapTask: kvstart = 585320; kvend = 731585; length = 983040
> 2010-11-04 10:45:41,289 INFO org.apache.hadoop.mapred.MapTask: Finished spill 4
> Note how long the last spill took.
> In MapTask.java, the following code waits for spill to finish:
> while (kvstart != kvend) { reporter.progress(); spillDone.await(); }
> In trunk code, code is similar.
> There is no timeout mechanism for Condition.await(). If the SpillThread takes a long time before calling spillDone.signal(), the task times out.
> Condition.awaitNanos(long nanosTimeout) should be called.
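
For illustration, here is a minimal, self-contained sketch of the bounded wait the issue proposes. It is not the MapTask code: the class SpillWaitSketch, the spillInProgress flag (standing in for kvstart != kvend), the progress() counter (standing in for reporter.progress()), and the simulated spill thread are all hypothetical. Only ReentrantLock, Condition.awaitNanos(long), and Condition.signal() are real java.util.concurrent APIs. As the comment above points out, a loop like this keeps the task from being killed but reports progress from a thread that is only waiting.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class SpillWaitSketch {
    private final ReentrantLock spillLock = new ReentrantLock();
    private final Condition spillDone = spillLock.newCondition();
    // Hypothetical stand-in for "kvstart != kvend"; guarded by spillLock.
    private boolean spillInProgress = true;
    private final AtomicInteger progressCalls = new AtomicInteger();

    // Hypothetical stand-in for reporter.progress().
    private void progress() {
        progressCalls.incrementAndGet();
    }

    // Waits for the spill to finish, waking at least every timeoutNanos
    // so that progress() is called while the spill is still running.
    void awaitSpill(long timeoutNanos) throws InterruptedException {
        spillLock.lock();
        try {
            while (spillInProgress) {
                progress();
                // Returns on signal, on timeout, or on a spurious wakeup;
                // the loop condition is re-checked in every case.
                spillDone.awaitNanos(timeoutNanos);
            }
        } finally {
            spillLock.unlock();
        }
    }

    // Called by the (simulated) spill thread when the spill completes.
    void finishSpill() {
        spillLock.lock();
        try {
            spillInProgress = false;
            spillDone.signal();
        } finally {
            spillLock.unlock();
        }
    }

    // Runs one simulated spill and returns how often progress() was called.
    static int demo(long spillMillis, long timeoutNanos) throws InterruptedException {
        SpillWaitSketch sketch = new SpillWaitSketch();
        Thread spillThread = new Thread(() -> {
            try {
                Thread.sleep(spillMillis); // simulate a slow spill
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            sketch.finishSpill();
        });
        spillThread.start();
        sketch.awaitSpill(timeoutNanos);
        spillThread.join();
        return sketch.progressCalls.get();
    }

    public static void main(String[] args) throws InterruptedException {
        int calls = demo(300, TimeUnit.MILLISECONDS.toNanos(50));
        System.out.println("progress() calls during spill: " + calls);
    }
}
```

With an untimed await(), progress() would be called once before the wait and not again until the spill thread signals; with awaitNanos() it is called roughly once per timeout interval for as long as the spill runs.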

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

