hadoop-yarn-issues mailing list archives

From "Zhijie Shen (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-542) Change the default global AM max-attempts value to be not one
Date Thu, 11 Apr 2013 19:40:14 GMT

     [ https://issues.apache.org/jira/browse/YARN-542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhijie Shen updated YARN-542:
-----------------------------

    Description: 
Today, the default global AM max-attempts value is 1, which is a bad choice. AM max-attempts
counts both AM-level failures and container crashes caused by localization issues, lost nodes,
etc. To tolerate AM crashes that are not caused by user code, mainly lost nodes, we want to
give AMs some retries.

I propose we change it to at least two. We could also change it to 4 to match other retry configs.
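
To illustrate the reasoning above, here is a minimal toy model, not YARN code. The function name
`run_application` and the outcome labels are hypothetical and chosen only for this sketch. It assumes
every failed attempt counts against the limit, whether or not user code caused it, as described above.

```python
def run_application(max_attempts, attempt_outcomes):
    """Toy model of AM attempts (hypothetical, not the YARN implementation).

    attempt_outcomes lists what would happen on each successive attempt,
    e.g. "node_lost", "localization_failed", "user_error", or "ok".
    Every non-"ok" outcome consumes one attempt, regardless of its cause.
    Returns (succeeded, attempts_used).
    """
    attempts_used = 0
    for outcome in attempt_outcomes:
        if attempts_used >= max_attempts:
            break  # limit reached: the application is failed
        attempts_used += 1
        if outcome == "ok":
            return True, attempts_used
    return False, attempts_used


# The first attempt dies because its node is lost; the second would succeed.
outcomes = ["node_lost", "ok"]
print(run_application(1, outcomes))  # limit of 1: no retry is possible
print(run_application(2, outcomes))  # limit of 2: the retry can succeed
```

With a limit of 1, a single lost node fails the whole application even though user code did
nothing wrong. With a limit of 2, the same application gets one more attempt.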

  was:
Today, the AM max-retries is set to 1, which is a bad choice. AM max-retries counts both
AM-level failures and container crashes caused by localization issues, lost nodes, etc.
To tolerate AM crashes that are not caused by user code, mainly lost nodes, we want to give
AMs some retries.

I propose we change it to at least two. We could also change it to 4 to match other retry configs.

    
> Change the default global AM max-attempts value to be not one
> -------------------------------------------------------------
>
>                 Key: YARN-542
>                 URL: https://issues.apache.org/jira/browse/YARN-542
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Zhijie Shen
>
> Today, the default global AM max-attempts value is 1, which is a bad choice. AM max-attempts
> counts both AM-level failures and container crashes caused by localization issues, lost nodes,
> etc. To tolerate AM crashes that are not caused by user code, mainly lost nodes, we want to
> give AMs some retries.
> I propose we change it to at least two. We could also change it to 4 to match other retry configs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
