hadoop-mapreduce-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4951) Container preemption interpreted as task failure
Date Thu, 24 Jan 2013 23:51:13 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562142#comment-13562142 ]

Jason Lowe commented on MAPREDUCE-4951:
---------------------------------------

Agreed that solving MAPREDUCE-4955 is a separate issue; sorry for the extra noise.  I just
wanted to point out that even with this patch there will still be spurious task failures if
the task notifies the AM before the AM sees the container status from the RM.
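
To make the ordering problem concrete, here is a toy model of the race. It is only a sketch: the class, the method names, and the "first terminal event wins" rule are assumptions for illustration and are not taken from the MR AM source or from the patch.

```java
// Toy model of the race described above. All names are hypothetical, and the
// "first terminal event wins" rule is an assumption, not the real AM logic.
final class AttemptRaceSketch {
    enum State { RUNNING, KILLED, FAILED }

    // Exit code that the issue description says marks a killed container.
    static final int KILLED_EXIT_CODE = -100;

    private State state = State.RUNNING;

    // The task itself tells the AM that it is going down.
    void onTaskReportedFailure() {
        if (state == State.RUNNING) {
            state = State.FAILED;
        }
    }

    // The AM learns the container's exit status from the RM.
    void onContainerCompleted(int exitCode) {
        if (state != State.RUNNING) {
            return; // too late: an earlier event already settled the outcome
        }
        state = (exitCode == KILLED_EXIT_CODE) ? State.KILLED : State.FAILED;
    }

    State state() {
        return state;
    }

    public static void main(String[] args) {
        AttemptRaceSketch rmFirst = new AttemptRaceSketch();
        rmFirst.onContainerCompleted(KILLED_EXIT_CODE);
        rmFirst.onTaskReportedFailure();
        System.out.println("RM status first:   " + rmFirst.state());

        AttemptRaceSketch taskFirst = new AttemptRaceSketch();
        taskFirst.onTaskReportedFailure();
        taskFirst.onContainerCompleted(KILLED_EXIT_CODE);
        System.out.println("Task report first: " + taskFirst.state());
    }
}
```

In this model the same preempted attempt ends up KILLED when the RM status arrives first and FAILED when the task's own report arrives first, which is the spurious failure in question.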
                
> Container preemption interpreted as task failure
> ------------------------------------------------
>
>                 Key: MAPREDUCE-4951
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4951
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: applicationmaster, mr-am, mrv2
>    Affects Versions: 2.0.2-alpha
>            Reporter: Sandy Ryza
>            Assignee: Sandy Ryza
>         Attachments: MAPREDUCE-4951-1.patch, MAPREDUCE-4951-2.patch, MAPREDUCE-4951.patch
>
>
> When YARN reports a completed container to the MR AM, the AM always interprets it as a
> failure. This can lead to a job failing because too many of its tasks failed, when in fact
> they only failed because the scheduler preempted them.
> MR needs to recognize the special exit code value of -100 and interpret it as a container
> being killed instead of a container failure.
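
The behavior requested in the description can be sketched as follows. This is a minimal illustration and not the attached patch: the class, enum, and method names are hypothetical, and only the value -100 comes from the description above.

```java
// Minimal sketch of the requested behavior. Names are hypothetical; only the
// exit code value -100 is taken from the issue description.
final class ContainerExitClassifier {
    // Special exit code that should be read as "killed", not "failed".
    static final int KILLED_EXIT_CODE = -100;

    enum Outcome { KILLED, FAILED }

    // Classify a completed container reported to the AM by its exit code.
    static Outcome classify(int exitCode) {
        return exitCode == KILLED_EXIT_CODE ? Outcome.KILLED : Outcome.FAILED;
    }

    // Count only genuine failures, so killed containers do not push a job
    // toward its failed-task limit.
    static int countFailures(int[] exitCodes) {
        int failures = 0;
        for (int code : exitCodes) {
            if (classify(code) == Outcome.FAILED) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        int[] exitCodes = { -100, 1, -100, 137 };
        System.out.println("Failures counted: " + countFailures(exitCodes));
    }
}
```

With the sample exit codes in `main`, the two containers that exited with -100 are treated as killed and are left out of the failure count.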

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
