hadoop-mapreduce-issues mailing list archives

From "Zhijie Shen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5900) Container preemption interpreted as task failures and eventually job failures
Date Tue, 01 Jul 2014 08:40:25 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048642#comment-14048642 ]

Zhijie Shen commented on MAPREDUCE-5900:

Looks good to me overall. Just some minor suggestions:

1. No need to duplicate most of the code to create a "testPreemptedContainerEvent". How about
changing the class name and adding the abortedEvent logic?

2. Maybe it's better to instantiate a TaskAttemptKillEvent object here and to put diagnostics
into the message field, differentiating between the kill and preemption root causes.
       // killed by framework
       return new TaskAttemptEvent(attemptID,

3. Personally, I don't think the newly added test cases in TestTaskAttempt are relevant to
this issue, because they actually test transitions on a kill event, whereas the problem
is interpreting preemption as a kill event. However, there's no harm in having the tests, because
we previously didn't cover most of the task attempt transitions. One suggestion here is
to rename "testContainerPreempted..." to "testKill...".
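
To make suggestion 2 concrete, here is a rough, self-contained sketch of the idea, not the actual patch. The class PreemptionSketch, its AttemptEvent type, the Kind enum, and the diagnostic strings are all hypothetical stand-ins for the real MR classes. The value -102 is the preemption exit code from the issue description; -100 is assumed here to be the "aborted by framework" exit status.

```java
public class PreemptionSketch {
    // Assumed exit status for a container aborted by the framework.
    static final int ABORTED = -100;
    // Preemption exit status, as given in the issue description.
    static final int PREEMPTED = -102;

    // Hypothetical event kinds; the real MR event types differ.
    enum Kind { KILL, CONTAINER_COMPLETED }

    // Hypothetical stand-in for a task attempt event that carries a message.
    static final class AttemptEvent {
        final String attemptId;
        final Kind kind;
        final String message;

        AttemptEvent(String attemptId, Kind kind, String message) {
            this.attemptId = attemptId;
            this.kind = kind;
            this.message = message;
        }
    }

    // Map a container exit status to an event. Preempted and aborted
    // containers both become kill events (so they should not be treated as
    // task failures), but the message records which root cause applied.
    static AttemptEvent toEvent(String attemptId, int exitStatus) {
        if (exitStatus == PREEMPTED) {
            return new AttemptEvent(attemptId, Kind.KILL,
                "Container preempted by scheduler");
        }
        if (exitStatus == ABORTED) {
            return new AttemptEvent(attemptId, Kind.KILL,
                "Container killed by framework");
        }
        // Any other status is reported as an ordinary container completion.
        return new AttemptEvent(attemptId, Kind.CONTAINER_COMPLETED,
            "Container exited with status " + exitStatus);
    }

    public static void main(String[] args) {
        AttemptEvent e = toEvent("attempt_0001", PREEMPTED);
        System.out.println(e.kind + ": " + e.message);
    }
}
```

With the root cause carried in the message, the kill and preemption cases share one transition path and are still distinguishable in the diagnostics.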

> Container preemption interpreted as task failures and eventually job failures 
> ------------------------------------------------------------------------------
>                 Key: MAPREDUCE-5900
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5900
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: applicationmaster, mr-am, mrv2
>    Affects Versions: 2.4.1
>            Reporter: Mayank Bansal
>            Assignee: Mayank Bansal
>         Attachments: MAPREDUCE-5900-1.patch, MAPREDUCE-5900-branch-241-2.patch, MAPREDUCE-5900-trunk-1.patch,
> We have added a preemption exit code that needs to be incorporated.
> MR needs to recognize the special exit code value of -102 and interpret it as a container
> being killed instead of a container failure.

This message was sent by Atlassian JIRA
