hadoop-mapreduce-issues mailing list archives

From "Vinod Kumar Vavilapalli (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5956) MapReduce AM should not use maxAttempts to determine if this is the last retry
Date Tue, 08 Jul 2014 20:12:06 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14055422#comment-14055422 ]

Vinod Kumar Vavilapalli commented on MAPREDUCE-5956:
----------------------------------------------------

(AMRMClient|AMRMClientAsync).unregisterApplicationMaster() is a blocking call. If an attempt
calls this API and the call succeeds, that attempt is the last one, and the AM can go ahead and
do its cleanup. All other attempts (those that either never call this API or fail before the
call returns) do not need to do any cleanup. There are, of course, corner cases where this is
not sufficient.

For that and all the failing cases, the only comprehensive solution I can think of is YARN-2261.
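
As a rough illustration of the rule described above (clean up only after the blocking unregister call has returned successfully), here is a minimal sketch. The Unregistrar interface and the shouldCleanUp helper are hypothetical stand-ins so that the example compiles without Hadoop on the classpath; in a real AM the call would be unregisterApplicationMaster() on the AMRMClient, which can also throw YarnException. The sketch does not cover the corner cases mentioned above.

```java
import java.io.IOException;

public class LastAttemptSketch {

  /** Hypothetical stand-in for the blocking unregisterApplicationMaster() call. */
  public interface Unregistrar {
    void unregister() throws IOException;
  }

  /**
   * Returns true only if the unregister call returned normally, meaning this
   * attempt can treat itself as the last one and perform cleanup.
   */
  public static boolean shouldCleanUp(Unregistrar rm) {
    try {
      rm.unregister(); // assumed to block until the RM has accepted the request
      return true;
    } catch (IOException e) {
      // The call did not complete, so another attempt may follow: skip cleanup.
      return false;
    }
  }

  public static void main(String[] args) {
    boolean afterSuccess = shouldCleanUp(() -> { });
    boolean afterFailure = shouldCleanUp(() -> {
      throw new IOException("RM unreachable");
    });
    System.out.println("cleanup after successful unregister: " + afterSuccess);
    System.out.println("cleanup after failed unregister: " + afterFailure);
  }
}
```

The point of the design is that the decision depends on an acknowledgement from the RM rather than on a locally computed attempt count.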

> MapReduce AM should not use maxAttempts to determine if this is the last retry
> ------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5956
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5956
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: applicationmaster, mrv2
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Wangda Tan
>            Priority: Blocker
>
> Found this while reviewing YARN-2074. The problem is that after YARN-2074, we don't count
> AM preemption towards AM failures on the RM side, but the MapReduce AM itself checks the
> attempt id against the max-attempt count to determine if this is the last attempt.
> {code}
>     public void computeIsLastAMRetry() {
>       isLastAMRetry = appAttemptID.getAttemptId() >= maxAppAttempts;
>     }
> {code}
> This causes issues w.r.t. deletion of the staging directory, etc.
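
To make the mismatch in the description above concrete, here is a small sketch with a max-attempt count of 2 and a first attempt that was preempted. isLastRetryByAttemptId mirrors the quoted check; rmWouldStartAnotherAttempt is a hypothetical, simplified model of the RM-side accounting after YARN-2074 (preempted attempts are not counted as failures) and is not actual RM code.

```java
public class AttemptCountSketch {

  /** Mirrors the quoted AM-side check, with the fields passed as parameters. */
  public static boolean isLastRetryByAttemptId(int attemptId, int maxAppAttempts) {
    return attemptId >= maxAppAttempts;
  }

  /**
   * Hypothetical model of the RM side: only non-preempted attempts that have
   * ended count towards the limit.
   */
  public static boolean rmWouldStartAnotherAttempt(
      int endedAttempts, int preemptedAttempts, int maxAppAttempts) {
    int countedFailures = endedAttempts - preemptedAttempts;
    return countedFailures < maxAppAttempts;
  }

  public static void main(String[] args) {
    int maxAppAttempts = 2;

    // Attempt 1 was preempted; attempt 2 is now running.
    boolean amThinksLast = isLastRetryByAttemptId(2, maxAppAttempts);

    // If attempt 2 also fails: two attempts have ended, one of them preempted.
    boolean rmRetries = rmWouldStartAnotherAttempt(2, 1, maxAppAttempts);

    System.out.println("AM believes attempt 2 is the last retry: " + amThinksLast);
    System.out.println("RM would still start attempt 3: " + rmRetries);
  }
}
```

Under these assumptions both values are true: attempt 2 would treat itself as the last retry and clean up (for example, delete the staging directory) even though a third attempt could still be launched.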



--
This message was sent by Atlassian JIRA
(v6.2#6252)
