hadoop-yarn-issues mailing list archives

From "Xuan Gong (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2074) Preemption of AM containers shouldn't count towards AM failures
Date Wed, 21 May 2014 23:18:39 GMT

    [ https://issues.apache.org/jira/browse/YARN-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14005370#comment-14005370 ]

Xuan Gong commented on YARN-2074:

1. {code}
    RMAppAttempt attempt =
        new RMAppAttemptImpl(appAttemptId, rmContext, scheduler, masterService,
          submissionContext, conf, maxAppAttempts <= attempts.size());
{code}
Using this condition to decide whether this RMAppAttempt is the last attempt (isLastAttempt)
does not sound right to me.
For example, suppose we set maxAppAttempts to 3 and the previous 2 AMs were preempted. Based on
the condition set here, is the next RMAppAttempt treated as the last attempt? If that attempt
then fails, the whole application will be marked as failed.
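
To make the concern in point 1 concrete, here is a minimal, self-contained sketch. It does not use the real RMAppImpl/RMAppAttemptImpl classes: AttemptRecord, LastAttemptCheck, and both method names are hypothetical, and the "previous attempts + 1" convention is an assumption made only for illustration. It contrasts counting every earlier attempt with counting only the attempts that were not preempted.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for one earlier application attempt (not the real RMAppAttempt).
final class AttemptRecord {
    final int id;
    final boolean preempted; // true if the AM container of this attempt was preempted

    AttemptRecord(int id, boolean preempted) {
        this.id = id;
        this.preempted = preempted;
    }
}

// Hypothetical helper; the names and the counting convention are assumptions.
final class LastAttemptCheck {
    // Counts every earlier attempt, preempted or not.
    static boolean isLastAttemptNaive(int maxAppAttempts, List<AttemptRecord> previous) {
        // +1 accounts for the attempt that is about to be created.
        return previous.size() + 1 >= maxAppAttempts;
    }

    // Counts only earlier attempts that were not preempted.
    static boolean isLastAttemptIgnoringPreemption(int maxAppAttempts,
                                                   List<AttemptRecord> previous) {
        long counted = previous.stream().filter(a -> !a.preempted).count();
        return counted + 1 >= maxAppAttempts;
    }

    public static void main(String[] args) {
        // The scenario from the comment: maxAppAttempts = 3, two earlier AMs preempted.
        List<AttemptRecord> previous = Arrays.asList(
                new AttemptRecord(1, true),
                new AttemptRecord(2, true));
        System.out.println("naive: " + isLastAttemptNaive(3, previous));
        System.out.println("ignoring preemption: "
                + isLastAttemptIgnoringPreemption(3, previous));
    }
}
```

Under these assumptions, the naive check flags the third attempt as the last one, while the preemption-aware check still leaves retries available.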

2. {code}
  public boolean isPreempted() {
    return getDiagnostics().contains(SchedulerUtils.PREEMPTED_CONTAINER);
  }
{code}
It is fine to use this to check isPreempted. However, see https://issues.apache.org/jira/browse/YARN-614:
that ticket says we should separate hardware failures and YARN issues from AM failures, and not
count them as AM failures. I think preemption of the AM is one of those cases. So maybe we could
use a more general way to check whether the AM was preempted (check ContainerExitStatus instead?).
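
As a rough illustration of the "check the exit status" idea in point 2, the sketch below classifies a finished AM container by a numeric exit status rather than by searching the diagnostics string. It is self-contained and does not call the Hadoop API: AmExitClassifier and countsAsAmFailure are hypothetical names, the constants are local stand-ins named after entries in ContainerExitStatus with values assumed for illustration, and which statuses are exempt is itself an assumption.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical classifier; not part of YARN.
final class AmExitClassifier {
    // Local stand-ins for exit statuses (values assumed for illustration).
    static final int SUCCESS = 0;
    static final int ABORTED = -100;
    static final int DISKS_FAILED = -101;
    static final int PREEMPTED = -102;

    // Exit statuses assumed to be caused by the platform rather than by the AM.
    private static final Set<Integer> NOT_AM_FAULT =
            new HashSet<>(Arrays.asList(ABORTED, DISKS_FAILED, PREEMPTED));

    // True only if the exit should count towards the limit on AM failures.
    static boolean countsAsAmFailure(int exitStatus) {
        return exitStatus != SUCCESS && !NOT_AM_FAULT.contains(exitStatus);
    }

    public static void main(String[] args) {
        System.out.println("preempted counts: " + countsAsAmFailure(PREEMPTED));
        System.out.println("exit code 1 counts: " + countsAsAmFailure(1));
    }
}
```

A set of exempt statuses keeps the preemption case and the YARN-614 cases behind one check, so adding another non-AM cause would only mean adding one entry.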

> Preemption of AM containers shouldn't count towards AM failures
> ---------------------------------------------------------------
>                 Key: YARN-2074
>                 URL: https://issues.apache.org/jira/browse/YARN-2074
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Jian He
>         Attachments: YARN-2074.1.patch, YARN-2074.2.patch
> One orthogonal concern with issues like YARN-2055 and YARN-2022 is that AM containers getting preempted shouldn't count towards AM failures and thus shouldn't eventually fail applications.
> We should explicitly handle AM container preemption/kill as a separate issue and not count it towards the limit on AM failures.
