hadoop-mapreduce-issues mailing list archives

From "Joydeep Sen Sarma (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-2214) TaskTracker should release slot if task is not launched
Date Thu, 09 Dec 2010 00:20:03 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969572#action_12969572 ]

Joydeep Sen Sarma commented on MAPREDUCE-2214:
----------------------------------------------

i think what happened in our case was something like this:
# task was requested to be killed
# the TT performed the kill action and reported back to the JT
# but the task reported back as done - at which point the TT promptly moved it into the SUCCEEDED state
# meanwhile the JT scheduled a cleanup and the cleanup failed to launch without returning the slot

the criss-crossing of steps 2 and 3 was what was unexpected i think (something the code doesn't
anticipate): the TT had already reported the kill when the task's own "done" report moved it to SUCCEEDED.
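
a minimal sketch of the handling the issue asks for, assuming a much simplified model of the TT. SlotReleaseSketch, SlotPool, the reduced State enum and this TaskInProgress are all hypothetical names made up for illustration, not the real TaskTracker classes. the only point it shows is that when launchTask() finds the task in an unexpected state (here SUCCEEDED), it gives the reserved slot back instead of silently returning:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, simplified model; not the real TaskTracker classes.
public class SlotReleaseSketch {

    // Reduced set of states, just enough to show the outcome of the race.
    enum State { UNASSIGNED, RUNNING, SUCCEEDED }

    // Counts free slots on a single (imaginary) TaskTracker.
    static class SlotPool {
        private final AtomicInteger free;
        SlotPool(int total) { free = new AtomicInteger(total); }
        void acquire() { free.decrementAndGet(); }
        void release() { free.incrementAndGet(); }
        int freeSlots() { return free.get(); }
    }

    static class TaskInProgress {
        private State state;
        private final SlotPool slots;
        private boolean slotHeld;

        // Assumption: a slot is reserved when the task is assigned, before launch.
        TaskInProgress(State initial, SlotPool slots) {
            this.state = initial;
            this.slots = slots;
            slots.acquire();
            this.slotHeld = true;
        }

        // Returns true if the task was launched. If the state is not the
        // expected one, the reserved slot is given back instead of leaked.
        synchronized boolean launchTask() {
            if (state != State.UNASSIGNED) {
                releaseSlot();
                return false;
            }
            state = State.RUNNING;
            return true;
        }

        // Idempotent, so a later call cannot release the same slot twice.
        synchronized void releaseSlot() {
            if (slotHeld) {
                slotHeld = false;
                slots.release();
            }
        }
    }

    public static void main(String[] args) {
        SlotPool pool = new SlotPool(1);
        // The task already reached SUCCEEDED by the time launchTask() runs.
        TaskInProgress tip = new TaskInProgress(State.SUCCEEDED, pool);
        boolean launched = tip.launchTask();
        System.out.println("launched=" + launched + " freeSlots=" + pool.freeSlots());
    }
}
```

the release is guarded by a flag so that whichever path notices the problem first frees the slot and any later path becomes a no-op. whether the real code should track this per task or in the slot accounting itself is left open here.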

we don't hit this problem with speculation because we never request speculation when the task
is about to complete: there's a check on the remaining time on the task, and if the remaining
time is less than N min we don't speculate. (there's a jira for this - don't remember which.)
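
a rough sketch of that kind of remaining-time check. SpeculationGuardSketch and shouldSpeculate are hypothetical names, and the way remaining time is estimated here (linear extrapolation from progress so far) is purely an assumption for illustration, since the comment above does not say how the real check computes it:

```java
// Hypothetical sketch; not the actual speculation code.
public class SpeculationGuardSketch {

    // progress:            fraction of the task completed, in [0.0, 1.0]
    // elapsedMillis:       how long the task has been running
    // minRemainingMillis:  the "N min" threshold, expressed in milliseconds
    static boolean shouldSpeculate(double progress, long elapsedMillis,
                                   long minRemainingMillis) {
        if (progress >= 1.0) {
            return false; // already done, nothing to speculate
        }
        if (progress <= 0.0) {
            // Assumption: with no progress yet the remaining time is unknown,
            // so it is treated as large and the task stays eligible.
            return true;
        }
        // Assumption: remaining time is extrapolated linearly from progress.
        double estimatedTotal = elapsedMillis / progress;
        double estimatedRemaining = estimatedTotal - elapsedMillis;
        return estimatedRemaining >= minRemainingMillis;
    }

    public static void main(String[] args) {
        long fiveMinutes = 5 * 60 * 1000L;
        // Half done after 10 minutes: roughly 10 minutes left.
        System.out.println(shouldSpeculate(0.5, 10 * 60 * 1000L, fiveMinutes));
        // 90% done after 9 minutes: roughly 1 minute left.
        System.out.println(shouldSpeculate(0.9, 9 * 60 * 1000L, fiveMinutes));
    }
}
```

with a guard like this a task that is about to finish is never asked to speculate, so the kill of the losing attempt cannot cross with a "done" report the way it did in the kill case above.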

> TaskTracker should release slot if task is not launched
> -------------------------------------------------------
>
>                 Key: MAPREDUCE-2214
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2214
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.20.1
>            Reporter: Ramkumar Vadali
>            Assignee: Ramkumar Vadali
>
> TaskTracker.TaskInProgress.launchTask() does not launch a task if it is not in an expected
> state. However, in the case where the task is not launched, the slot is not released. We have
> observed this in production - the task was in SUCCEEDED state by the time launchTask() got
> to it and then the slot was never released. It is not clear how the task got into that state,
> but it is better to handle the case.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

