hadoop-mapreduce-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5956) MapReduce AM should not use maxAttempts to determine if this is the last retry
Date Fri, 22 Aug 2014 16:12:12 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14107023#comment-14107023 ]

Jason Lowe commented on MAPREDUCE-5956:

When it comes to leaking staging directories, there are far more common cases where that occurs
than this scenario, e.g., the application is killed before the AM starts or in between AM retries,
or the AM is misconfigured and fails every time. It seems like the scenario we're worried about
is highly unlikely, so I don't think it'd be a big deal to put it into 2.5.1 from that standpoint.
IIRC the problem being fixed here isn't an issue unless preemption is a factor. If we're
officially supporting preemption in 2.5, then I think this is a good candidate to consider
for 2.5.1; otherwise it can wait until 2.6.

> MapReduce AM should not use maxAttempts to determine if this is the last retry
> ------------------------------------------------------------------------------
>                 Key: MAPREDUCE-5956
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5956
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: applicationmaster, mrv2
>    Affects Versions: 2.4.0
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Wangda Tan
>            Priority: Blocker
>             Fix For: 2.6.0
>         Attachments: MR-5956.patch, MR-5956.patch
> Found this while reviewing YARN-2074. The problem is that after YARN-2074, we don't count
> AM preemption towards AM failures on RM side, but MapReduce AM itself checks the attempt id
> against the max-attempt count to determine if this is the last attempt.
> {code}
>     public void computeIsLastAMRetry() {
>       isLastAMRetry = appAttemptID.getAttemptId() >= maxAppAttempts;
>     }
> {code}
> This causes issues w.r.t. deletion of the staging directory, etc.
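
The mismatch described in the issue can be illustrated with a minimal, self-contained sketch. It is not the attached patch and does not use the real Hadoop classes: the class `LastRetrySketch`, both method names, and the `countedFailures` parameter are hypothetical, introduced only to contrast the attempt-id check quoted above with a check based on the failures that the RM actually counts.

```java
public class LastRetrySketch {

    // Mirrors the check quoted in the issue: compares the raw attempt id
    // with the configured maximum number of attempts.
    static boolean isLastRetryByAttemptId(int attemptId, int maxAppAttempts) {
        return attemptId >= maxAppAttempts;
    }

    // Hypothetical alternative (an assumption, not the actual fix): the current
    // attempt is the last one only if its failure would bring the number of
    // failures counted by the RM up to the maximum.
    static boolean isLastRetryByCountedFailures(int countedFailures, int maxAppAttempts) {
        return countedFailures + 1 >= maxAppAttempts;
    }

    public static void main(String[] args) {
        int maxAppAttempts = 2;

        // Attempt 1 was preempted. After YARN-2074 the RM does not count that
        // as a failure, so attempt 2 starts with zero counted failures.
        int attemptId = 2;
        int countedFailures = 0;

        // The attempt-id check treats attempt 2 as the last retry, although
        // the RM would still allow another attempt if this one fails.
        System.out.println("by attempt id:       "
                + isLastRetryByAttemptId(attemptId, maxAppAttempts));
        System.out.println("by counted failures: "
                + isLastRetryByCountedFailures(countedFailures, maxAppAttempts));
    }
}
```

In this scenario the attempt-id check reports a last retry while the counted-failures check does not. If the AM cleans up the staging directory on what it believes is its last retry, a later attempt that the RM still permits would find the directory gone.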
