tez-issues mailing list archives

From "Hitesh Shah (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TEZ-3426) Second AM attempt launched for session mode and recovery disabled for certain cases
Date Tue, 06 Sep 2016 20:46:21 GMT

    [ https://issues.apache.org/jira/browse/TEZ-3426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468513#comment-15468513 ]

Hitesh Shah commented on TEZ-3426:
----------------------------------

+1

> Second AM attempt launched for session mode and recovery disabled for certain cases
> -----------------------------------------------------------------------------------
>
>                 Key: TEZ-3426
>                 URL: https://issues.apache.org/jira/browse/TEZ-3426
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>            Assignee: Jason Lowe
>            Priority: Critical
>         Attachments: TEZ-3426.001.patch, TEZ-3426.002.patch, TEZ-3426.003.patch, TEZ-3426.004.patch
>
>
> ApplicationSubmissionContext#setMaxAppAttempts does not fully guarantee that there will
> be at most that many attempts. There are a few exceptional cases that are not counted.
> Tez should protect itself from accidentally starting the second attempt in session mode and
> when recovery is disabled, since the second attempt will always succeed with no work to do.
> {code}
>   @Override
>   public boolean shouldCountTowardsMaxAttemptRetry() {
>     try {
>       this.readLock.lock();
>       int exitStatus = getAMContainerExitStatus();
>       return !(exitStatus == ContainerExitStatus.PREEMPTED
>           || exitStatus == ContainerExitStatus.ABORTED
>           || exitStatus == ContainerExitStatus.DISKS_FAILED
>           || exitStatus == ContainerExitStatus.KILLED_BY_RESOURCEMANAGER);
>     } finally {
>       this.readLock.unlock();
>     }
>   }
> {code}
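
For illustration only, the kind of guard the description asks for might look like the sketch below. It is not the attached patch and not a Tez or YARN API: the class name AttemptGuard, the method shouldRejectAttempt, and its parameters are all hypothetical. It also assumes one reading of the description, namely that a retry should be refused only when the AM is in session mode and recovery is disabled, and that attempt numbers start at 1.

```java
// Hypothetical sketch; not the TEZ-3426 patch and not part of Tez or YARN.
public class AttemptGuard {

    /**
     * Decides whether an AM attempt should refuse to run.
     *
     * @param attemptNumber   attempt number of this AM, assumed to start at 1
     * @param sessionMode     true if the AM runs in session mode
     * @param recoveryEnabled true if recovery is enabled
     * @return true when this is a retry that would have no work to recover
     */
    static boolean shouldRejectAttempt(int attemptNumber,
                                       boolean sessionMode,
                                       boolean recoveryEnabled) {
        // Any attempt after the first is a retry, even if the resource
        // manager did not count the earlier failure toward the limit.
        boolean isRetry = attemptNumber > 1;
        return isRetry && sessionMode && !recoveryEnabled;
    }

    public static void main(String[] args) {
        // First attempt always runs.
        System.out.println(shouldRejectAttempt(1, true, false));  // false
        // Uncounted failure (e.g. PREEMPTED) led to a second attempt.
        System.out.println(shouldRejectAttempt(2, true, false));  // true
        // With recovery enabled the second attempt has work to resume.
        System.out.println(shouldRejectAttempt(2, true, true));   // false
    }
}
```

The check keys off the attempt number rather than the configured maximum because, per the quoted method, exit statuses such as PREEMPTED or DISKS_FAILED do not count toward the limit, so a limit of one attempt can still yield a second launch.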



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
