hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-243) Job Client doesn't give progress for Application Master Retries
Date Mon, 26 Nov 2012 16:58:59 GMT

    [ https://issues.apache.org/jira/browse/YARN-243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13503896#comment-13503896 ]

Jason Lowe commented on YARN-243:
---------------------------------

That doesn't sound like something to fix on the client side.  If the AM told the client that
the job failed, then the job should have failed.  An attempt can die after it has told the
client the job's final status but before it has told the RM, and IMHO we should fix things so
that the subsequent AM attempt doesn't retry the job but simply updates the RM with the failed
status left by the previous attempt.  Otherwise we run into bad situations where we've already
told the client the job failed, but the job subsequently retries (possibly from scratch,
depending upon the output format's support for recovery) and could succeed.  If the job has
decided to fail and has already told the client, an AM attempt failure while trying to report
that same decision to the RM shouldn't allow the job to subsequently succeed, IMHO.
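
A minimal sketch of the behavior proposed above, assuming the previous attempt leaves behind
a record of any final status it already announced to the client. Every name here
(AmRecoverySketch, RmReporter, startAttempt, and so on) is hypothetical and is not a Hadoop or
YARN API; forwarding any announced status, not only FAILED, is also an assumption of the
sketch.

```java
import java.util.Optional;

/** Hypothetical sketch; none of these types are Hadoop or YARN APIs. */
final class AmRecoverySketch {

    /** Final job outcomes an attempt may have announced to the client. */
    enum FinalStatus { SUCCEEDED, FAILED, KILLED }

    /** What a newly started attempt ended up doing. */
    enum Action { REPORTED_PREVIOUS_STATUS, RAN_JOB }

    /** Assumed channel for telling the RM the job's final status. */
    interface RmReporter {
        void reportFinalStatus(FinalStatus status);
    }

    /**
     * Starts a new AM attempt. If the previous attempt already announced a
     * final status to the client, forward that status to the RM and do not
     * run the job again. Otherwise run the job as usual.
     */
    static Action startAttempt(Optional<FinalStatus> announcedByPreviousAttempt,
                               RmReporter rm,
                               Runnable runJob) {
        if (announcedByPreviousAttempt.isPresent()) {
            // The client has already been told the outcome, so it must stand.
            rm.reportFinalStatus(announcedByPreviousAttempt.get());
            return Action.REPORTED_PREVIOUS_STATUS;
        }
        // No decision was announced, so a retry cannot contradict the client.
        runJob.run();
        return Action.RAN_JOB;
    }

    public static void main(String[] args) {
        Action action = startAttempt(
                Optional.of(FinalStatus.FAILED),
                status -> System.out.println("RM told: " + status),
                () -> System.out.println("running job"));
        System.out.println(action);
    }
}
```

The point of the design is that the retry decision depends on what the client was told, not
on whether the RM heard about it, so a crash during the final report cannot turn an announced
failure into a later success.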
                
> Job Client doesn't give progress for Application Master Retries
> ---------------------------------------------------------------
>
>                 Key: YARN-243
>                 URL: https://issues.apache.org/jira/browse/YARN-243
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: client, resourcemanager
>    Affects Versions: 2.0.2-alpha, 2.0.1-alpha
>            Reporter: Devaraj K
>            Assignee: Devaraj K
>
> If AM retries are configured and the first attempt fails, the RM creates the next attempt,
> but the Job Client doesn't report progress for the retry attempts.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
