hadoop-yarn-dev mailing list archives

From "Anubhav Dhoot (JIRA)" <j...@apache.org>
Subject [jira] [Created] (YARN-4180) AMLauncher does not retry on failures when talking to NM
Date Thu, 17 Sep 2015 21:16:04 GMT
Anubhav Dhoot created YARN-4180:
-----------------------------------

             Summary: AMLauncher does not retry on failures when talking to NM 
                 Key: YARN-4180
                 URL: https://issues.apache.org/jira/browse/YARN-4180
             Project: Hadoop YARN
          Issue Type: Bug
          Components: resourcemanager
            Reporter: Anubhav Dhoot
            Assignee: Anubhav Dhoot


We see issues when the RM tries to launch a container while an NM is restarting: we get exceptions
like NMNotReadyException. While YARN-3842 added retry for other clients of the NM (mainly AMs),
it is not used by AMLauncher in the RM, so these intermittent errors cause job failures.
This can manifest during a rolling restart of NMs.
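
To illustrate the behavior described above, here is a minimal sketch of retrying a launch
call while the NM reports that it is not ready. It is not the YARN-3842 code or a proposed
patch. The class LaunchRetrySketch, the method callWithRetry, and the fixed-sleep policy are
hypothetical, and NMNotReadyException here is a local stand-in for the real YARN exception
so the example compiles without Hadoop on the classpath.

```java
import java.util.concurrent.Callable;

class LaunchRetrySketch {

    // Local stand-in for YARN's NMNotReadyException (assumption: keeps the sketch Hadoop-free).
    static class NMNotReadyException extends Exception {
        NMNotReadyException(String message) {
            super(message);
        }
    }

    // Hypothetical helper: runs the call up to maxAttempts times, sleeping a fixed
    // interval after each NMNotReadyException. Any other exception propagates at once.
    static <T> T callWithRetry(Callable<T> call, int maxAttempts, long sleepMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        NMNotReadyException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (NMNotReadyException e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(sleepMillis);
                }
            }
        }
        // All attempts failed with NMNotReadyException; rethrow the last one.
        throw last;
    }

    public static void main(String[] args) throws Exception {
        final int[] attempts = {0};
        // Simulated NM that is "restarting" for the first two calls.
        String status = callWithRetry(() -> {
            if (++attempts[0] < 3) {
                throw new NMNotReadyException("NM is restarting");
            }
            return "launched";
        }, 5, 10L);
        System.out.println(status + " after " + attempts[0] + " attempts");
    }
}
```

Without such a wrapper, the first NMNotReadyException reaches the caller directly, which
matches the job failures reported here.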



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
