hadoop-yarn-issues mailing list archives

From "Anubhav Dhoot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4185) Retry interval delay for NM client can be improved from the fixed static retry
Date Mon, 05 Oct 2015 18:45:26 GMT

    [ https://issues.apache.org/jira/browse/YARN-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14943809#comment-14943809 ]

Anubhav Dhoot commented on YARN-4185:
-------------------------------------

I don't think option 2, where you restart from 1, makes sense. It's also not a goal to minimize
the total wait time. The goal should be to minimize the time to recover from short intermittent
failures while still waiting long enough on long failures before giving up. Would it be better
for us to ramp up to 10 sec exponentially and then do the n retries at 10 sec, or to do n
retries in total, including the ramp-up?
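
To make the two alternatives in the question above concrete, here is a minimal sketch of both
retry schedules. It is an illustration only, not the NM client code or any patch attached to this
issue: the class and method names are hypothetical, and the 1 sec starting delay and the doubling
factor are assumptions. Only the 10 sec cap comes from the issue description.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper comparing two retry budgets; not part of Hadoop. */
final class RetrySchedules {
    static final long BASE_MS = 1_000L;  // assumed starting delay
    static final long MAX_MS = 10_000L;  // cap taken from the issue description

    /** Delay before the given 0-based retry: doubles each time, capped at MAX_MS. */
    static long cappedDelayMs(int retry) {
        long delay = BASE_MS;
        for (int i = 0; i < retry && delay < MAX_MS; i++) {
            delay *= 2;
        }
        return Math.min(delay, MAX_MS);
    }

    /** First alternative: ramp up to the cap, then do n more retries at the cap. */
    static List<Long> rampThenNAtMax(int n) {
        List<Long> delays = new ArrayList<>();
        int retry = 0;
        while (cappedDelayMs(retry) < MAX_MS) {
            delays.add(cappedDelayMs(retry));
            retry++;
        }
        for (int i = 0; i < n; i++) {
            delays.add(MAX_MS);
        }
        return delays;
    }

    /** Second alternative: n retries in total, with the ramp-up counted against n. */
    static List<Long> nTotalIncludingRamp(int n) {
        List<Long> delays = new ArrayList<>();
        for (int retry = 0; retry < n; retry++) {
            delays.add(cappedDelayMs(retry));
        }
        return delays;
    }

    /** Total time spent waiting across a schedule. */
    static long totalMs(List<Long> delays) {
        long sum = 0;
        for (long d : delays) {
            sum += d;
        }
        return sum;
    }
}
```

Under these assumptions, with n = 3 the first alternative waits 1, 2, 4, 8, 10, 10, 10 sec (45 sec
in total), while the second waits only 1, 2, 4 sec (7 sec in total) and gives up before ever
reaching the 10 sec cap. That is the trade-off behind the question: counting the ramp-up against
n shortens how long we wait on a long failure.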

> Retry interval delay for NM client can be improved from the fixed static retry 
> -------------------------------------------------------------------------------
>
>                 Key: YARN-4185
>                 URL: https://issues.apache.org/jira/browse/YARN-4185
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Anubhav Dhoot
>            Assignee: Neelesh Srinivas Salian
>
> Instead of having a fixed retry interval that starts off very high and stays there, we
> are better off using an exponential backoff that has the same fixed max limit. Today the retry
> interval is fixed at 10 sec, which can be unnecessarily high, especially when an NM can come
> back within a sec during a rolling restart.
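
The effect described in the issue can be sketched with a small simulation of how long a client
keeps waiting before its first successful reconnect. This is a rough model, not the actual NM
client retry policy: the names are hypothetical, the first attempt is assumed to fail at time 0,
and the 1 sec starting delay with doubling is an assumption. The 10 sec fixed interval and cap
come from the description.

```java
import java.util.function.IntToLongFunction;

/** Hypothetical model of time-to-reconnect under a retry policy; not part of Hadoop. */
final class RecoveryTime {
    /**
     * Returns the elapsed time (ms) at the first retry that happens once the outage
     * is over, or -1 if all retries are used up while the NM is still down.
     */
    static long timeToReconnectMs(long outageMs, int maxRetries, IntToLongFunction delayForRetry) {
        long elapsed = 0; // the initial attempt is assumed to fail at t = 0
        for (int retry = 0; retry < maxRetries; retry++) {
            elapsed += delayForRetry.applyAsLong(retry);
            if (elapsed >= outageMs) {
                return elapsed;
            }
        }
        return -1;
    }

    /** Today's behaviour per the issue: always wait 10 sec. */
    static long fixedDelayMs(int retry) {
        return 10_000L;
    }

    /** Assumed alternative: 1 sec doubling per retry, capped at the same 10 sec. */
    static long exponentialDelayMs(int retry) {
        long uncapped = 1_000L << Math.min(retry, 10); // bound the shift to avoid overflow
        return Math.min(uncapped, 10_000L);
    }
}
```

In this model, an NM that is back after 1 sec is reached after 10 sec with the fixed interval but
after 1 sec with the exponential one, for example via
`RecoveryTime.timeToReconnectMs(1_000L, 5, RecoveryTime::exponentialDelayMs)`. For a 60 sec outage
and 5 retries, both policies give up, which is why the retry count still has to be chosen with
long failures in mind.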



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
