hadoop-mapreduce-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5000) TaskImpl.getCounters() can return the counters for the wrong task attempt when task is speculating
Date Wed, 13 Feb 2013 00:23:12 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13577210#comment-13577210 ]

Hadoop QA commented on MAPREDUCE-5000:

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  against trunk revision .

    {color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3329//console

This message is automatically generated.
> TaskImpl.getCounters() can return the counters for the wrong task attempt when task is speculating
> --------------------------------------------------------------------------------------------------
>                 Key: MAPREDUCE-5000
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5000
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mr-am
>    Affects Versions: 0.23.6
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Critical
>         Attachments: MAPREDUCE-5000-branch-0.23.patch, MAPREDUCE-5000.patch
> When a task is speculating and one attempt completes, the counters for the wrong
> attempt are sometimes aggregated into the total counters for the job.  The scenario
> looks like this:
> # Two task attempts are racing, _0 and _1
> # _1 finishes first, causing the task to issue a TA_KILL to attempt _0
> # _0 receives TA_KILL, sets progress to 1.0f and waits for container cleanup
> # If TaskImpl.getCounters() is called now, TaskImpl.selectBestAttempt() can return _0,
> because _0 is not yet in the KILLED state, its progress is maxed out, and no other
> attempt has more progress.
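
The race described above can be modelled with a small, self-contained Java sketch. It is not the Hadoop source: the class, the simplified State enum (including KILL_PENDING), the recordsCounter field and both selection methods are hypothetical stand-ins. It assumes the selection logic skips only FAILED and KILLED attempts and keeps the earlier attempt when progress values tie, which matches the behaviour the report describes. The guarded variant shows one possible way to avoid the problem and is not necessarily what the attached patches do.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SpeculationCounterRace {
    // Simplified, hypothetical attempt states; the real MR AM has more of them.
    enum State { RUNNING, KILL_PENDING, SUCCEEDED, FAILED, KILLED }

    static final class Attempt {
        final String id;
        final State state;
        final float progress;
        final long recordsCounter; // stand-in for a full Counters object

        Attempt(String id, State state, float progress, long recordsCounter) {
            this.id = id;
            this.state = state;
            this.progress = progress;
            this.recordsCounter = recordsCounter;
        }
    }

    // Assumed selection logic: ignore FAILED/KILLED attempts and pick the one
    // with the highest progress. A strict '>' means ties keep the earlier attempt.
    static Attempt selectBestAttempt(Map<String, Attempt> attempts) {
        Attempt result = null;
        float best = 0f;
        for (Attempt a : attempts.values()) {
            if (a.state == State.FAILED || a.state == State.KILLED) {
                continue;
            }
            if (result == null) {
                result = a;
            }
            if (a.progress > best) {
                result = a;
                best = a.progress;
            }
        }
        return result;
    }

    // One possible guard (an assumption, not the actual patch): if any attempt
    // has succeeded, always report that attempt.
    static Attempt selectBestAttemptGuarded(Map<String, Attempt> attempts) {
        for (Attempt a : attempts.values()) {
            if (a.state == State.SUCCEEDED) {
                return a;
            }
        }
        return selectBestAttempt(attempts);
    }

    // State of the task right after _0 has received TA_KILL: it reports
    // progress 1.0f but has not reached KILLED because cleanup is pending.
    static Map<String, Attempt> raceScenario() {
        Map<String, Attempt> attempts = new LinkedHashMap<>();
        attempts.put("_0", new Attempt("_0", State.KILL_PENDING, 1.0f, 40L));
        attempts.put("_1", new Attempt("_1", State.SUCCEEDED, 1.0f, 100L));
        return attempts;
    }

    public static void main(String[] args) {
        Map<String, Attempt> attempts = raceScenario();
        System.out.println("naive:   " + selectBestAttempt(attempts).id);        // prints "naive:   _0"
        System.out.println("guarded: " + selectBestAttemptGuarded(attempts).id); // prints "guarded: _1"
    }
}
```

In this model the naive selection returns _0, so the job would aggregate the counters of the attempt that is about to be killed, while the guarded selection returns the successful attempt _1.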
