hadoop-yarn-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-614) Separate AM failures from hardware failure or YARN error and do not count them to AM retry count
Date Sun, 29 Jun 2014 11:15:27 GMT

    [ https://issues.apache.org/jira/browse/YARN-614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047096#comment-14047096 ]

Hudson commented on YARN-614:
-----------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk #598 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/598/])
YARN-614. Changed ResourceManager to not count disk failure, node loss and RM restart towards
app failures. Contributed by Xuan Gong (jianhe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1606407)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttempt.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java


> Separate AM failures from hardware failure or YARN error and do not count them to AM retry count
> ------------------------------------------------------------------------------------------------
>
>                 Key: YARN-614
>                 URL: https://issues.apache.org/jira/browse/YARN-614
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Bikas Saha
>            Assignee: Xuan Gong
>             Fix For: 2.5.0
>
>         Attachments: YARN-614-0.patch, YARN-614-1.patch, YARN-614-2.patch, YARN-614-3.patch,
>                      YARN-614-4.patch, YARN-614-5.patch, YARN-614-6.patch, YARN-614.10.patch,
>                      YARN-614.11.patch, YARN-614.12.patch, YARN-614.13.patch, YARN-614.7.patch,
>                      YARN-614.8.patch, YARN-614.9.patch
>
>
> Attempts can fail due to a large number of user errors and they should not be retried
> unnecessarily. The only reason YARN should retry an attempt is when the hardware fails or
> YARN has an error. NM failing, lost NM and NM disk errors are the hardware errors that come
> to mind.
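
The counting rule described in the commit message (disk failure, node loss and RM restart do not count towards app failures) can be illustrated with a small standalone model. This is only a sketch and is not the ResourceManager code changed by the patch: the names FailureCause, NOT_COUNTED and AppRetryTracker are hypothetical, and the real change lives in the Java classes listed above (RMAppImpl, RMAppAttemptImpl).

```python
from dataclasses import dataclass
from enum import Enum, auto


class FailureCause(Enum):
    """Hypothetical reasons an application attempt can end in failure."""
    AM_ERROR = auto()      # the ApplicationMaster itself failed
    DISK_FAILURE = auto()  # NM disk error
    NODE_LOST = auto()     # the node running the AM was lost
    RM_RESTART = auto()    # the ResourceManager restarted


# Causes that are not the application's fault and so are not counted
# against the retry limit (assumed set, taken from the commit message).
NOT_COUNTED = frozenset({
    FailureCause.DISK_FAILURE,
    FailureCause.NODE_LOST,
    FailureCause.RM_RESTART,
})


@dataclass
class AppRetryTracker:
    """Tracks only the failures that count towards the attempt limit."""
    max_attempts: int
    counted_failures: int = 0

    def record_failure(self, cause: FailureCause) -> bool:
        """Record a failed attempt; return True if a new attempt may start."""
        if cause not in NOT_COUNTED:
            self.counted_failures += 1
        return self.counted_failures < self.max_attempts
```

With max_attempts=2, a disk failure followed by a node loss leaves counted_failures at 0, while two AM_ERROR failures exhaust the limit regardless of how many uncounted failures occur in between.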



--
This message was sent by Atlassian JIRA
(v6.2#6252)
