hadoop-mapreduce-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4890) Invalid TaskImpl state transitions when task fails while speculating
Date Sat, 22 Dec 2012 12:51:14 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13538773#comment-13538773 ]

Hudson commented on MAPREDUCE-4890:

Integrated in Hadoop-Hdfs-0.23-Build #471 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/471/])
    svn merge -c 1425223 FIXES: MAPREDUCE-4890. Invalid TaskImpl state transitions when task fails while speculating. Contributed by Jason Lowe (Revision 1425227)

     Result = UNSTABLE
jlowe : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1425227
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java

> Invalid TaskImpl state transitions when task fails while speculating
> --------------------------------------------------------------------
>                 Key: MAPREDUCE-4890
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4890
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mr-am
>    Affects Versions: 2.0.2-alpha, 0.23.5
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Critical
>             Fix For: 2.0.3-alpha, 0.23.6
>         Attachments: MAPREDUCE-4890.patch
> There are a couple of issues when a task fails while speculating (i.e., multiple attempts are active):
> # The other active attempts are not killed.
> # TaskImpl's FAILED state does not handle the T_ATTEMPT_* set of events, which can be sent asynchronously from the other active task attempts. All of these events need to be handled.
> Failure to handle this properly means that jobs which are configured to tolerate task failures via mapreduce.map.failures.maxpercent or mapreduce.reduce.failures.maxpercent, and which also speculate, can easily end up failing due to invalid state transitions rather than completing successfully with a few explicitly allowed task failures.
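
The two issues described above can be illustrated with a small, self-contained sketch. This is not the code from TaskImpl.java or from the attached patch. The class SpeculativeTaskSketch, its methods, and the attempt IDs are hypothetical, the event names only mirror the T_ATTEMPT_* family mentioned in the description, and the sketch assumes that a single attempt failure is enough to fail the task. It shows a task that, on failing, requests a kill for every other active attempt and then absorbs late T_ATTEMPT_* events in its terminal state instead of rejecting them.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model of a task with speculative attempts.
// It is NOT Hadoop's TaskImpl; it only illustrates the behavior described above.
class SpeculativeTaskSketch {
    enum TaskState { RUNNING, SUCCEEDED, FAILED }

    // Names mirror the T_ATTEMPT_* events mentioned in the issue description.
    enum TaskEvent {
        T_ATTEMPT_LAUNCHED, T_ATTEMPT_COMMIT_PENDING,
        T_ATTEMPT_SUCCEEDED, T_ATTEMPT_FAILED, T_ATTEMPT_KILLED
    }

    private TaskState state = TaskState.RUNNING;
    private final Set<String> activeAttempts = new LinkedHashSet<>();
    private final List<String> killRequests = new ArrayList<>();

    void addAttempt(String attemptId) { activeAttempts.add(attemptId); }

    // Every T_ATTEMPT_* event is accepted in every state, so an event that
    // arrives asynchronously after the task has failed cannot cause an
    // invalid state transition.
    void handle(TaskEvent event, String attemptId) {
        switch (event) {
            case T_ATTEMPT_LAUNCHED:
            case T_ATTEMPT_COMMIT_PENDING:
                return; // progress events never change the task state here
            case T_ATTEMPT_KILLED:
                activeAttempts.remove(attemptId);
                return;
            case T_ATTEMPT_SUCCEEDED:
            case T_ATTEMPT_FAILED:
                activeAttempts.remove(attemptId);
                if (state != TaskState.RUNNING) {
                    return; // late event from another attempt: absorb it
                }
                // Assumption: one attempt failure fails the whole task.
                state = (event == TaskEvent.T_ATTEMPT_SUCCEEDED)
                        ? TaskState.SUCCEEDED : TaskState.FAILED;
                // Issue 1: request a kill for every attempt that is still active.
                killRequests.addAll(activeAttempts);
                return;
        }
    }

    TaskState getState() { return state; }
    List<String> getKillRequests() { return killRequests; }
    Set<String> getActiveAttempts() { return activeAttempts; }

    public static void main(String[] args) {
        SpeculativeTaskSketch task = new SpeculativeTaskSketch();
        task.addAttempt("attempt_0");
        task.addAttempt("attempt_1"); // speculative attempt
        task.handle(TaskEvent.T_ATTEMPT_FAILED, "attempt_0");
        // Events from the speculative attempt arrive after the task failed.
        task.handle(TaskEvent.T_ATTEMPT_COMMIT_PENDING, "attempt_1");
        task.handle(TaskEvent.T_ATTEMPT_KILLED, "attempt_1");
        System.out.println(task.getState() + " " + task.getKillRequests());
    }
}
```

In this sketch the task stays in FAILED while the speculative attempt's remaining events drain, so a job that tolerates some task failures would see one failed task rather than an error from the state machine.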

