hadoop-yarn-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-8673) [AMRMProxy] More robust responseId resync after an YarnRM master slave switch
Date Mon, 20 Aug 2018 19:49:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16586410#comment-16586410 ]

Hudson commented on YARN-8673:
------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14805 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14805/])
YARN-8673. [AMRMProxy] More robust responseId resync after an YarnRM (gifuma: rev 8736fc39ac3b3de168d2c216f3d1c0edb48fb3f9)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/AMRMClientUtils.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/AMRMClientRelayer.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/amrmproxy/FederationInterceptor.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/uam/UnmanagedApplicationManager.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/server/TestAMRMClientRelayer.java


> [AMRMProxy] More robust responseId resync after an YarnRM master slave switch
> -----------------------------------------------------------------------------
>
>                 Key: YARN-8673
>                 URL: https://issues.apache.org/jira/browse/YARN-8673
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: amrmproxy
>            Reporter: Botong Huang
>            Assignee: Botong Huang
>            Priority: Major
>         Attachments: YARN-8673.v1.patch, YARN-8673.v2.patch
>
>
> After a master-slave switch of the YarnRM, the new YarnRM throws an _ApplicationNotRegisteredException_.
The AM then re-registers and resets its responseId to zero. _AMRMClientRelayer_
inside _FederationInterceptor_ follows the same protocol and performs the re-register
and responseId resync automatically. However, when an exception or a temporary network issue
occurs in the allocate call after re-register, the resync logic can break. This patch makes the
process more robust by parsing the expected responseId from the YarnRM exception message,
so that whenever the responseId is out of sync, for whatever reason, we can automatically resync
and move on.
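
The resync idea described above can be illustrated with a minimal, self-contained sketch. It is not the code from the patch: the class name, method names, and the exact wording of the exception message are assumptions made for illustration only, and the real logic lives in the files listed in the commit (for example AMRMClientUtils.java and AMRMClientRelayer.java). The sketch only shows the general technique: the server side embeds the responseId it expects in the error text, and the client side extracts that number and adopts it instead of guessing.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not the actual YARN-8673 code. */
final class ResponseIdResyncSketch {

  // ASSUMPTION: the message wording below is illustrative, not taken from the patch.
  private static final Pattern EXPECTED_ID =
      Pattern.compile("expect responseId to be (-?\\d+)");

  /** Server side: build an error message that carries the expected responseId. */
  static String buildMessage(String attemptId, int expected, int received) {
    return "Invalid responseId in AllocateRequest from application attempt: "
        + attemptId + ", expect responseId to be " + expected
        + ", but get " + received;
  }

  /** Client side: recover the expected responseId, or empty if it is absent. */
  static OptionalInt parseExpectedResponseId(String message) {
    if (message == null) {
      return OptionalInt.empty();
    }
    Matcher m = EXPECTED_ID.matcher(message);
    if (!m.find()) {
      return OptionalInt.empty();
    }
    try {
      return OptionalInt.of(Integer.parseInt(m.group(1)));
    } catch (NumberFormatException e) {
      // The digits do not fit in an int; treat the message as unparseable.
      return OptionalInt.empty();
    }
  }

  public static void main(String[] args) {
    // Simulate an allocate call rejected because the local responseId is stale.
    String msg = buildMessage("appattempt_0001_000001", 7, 3);
    int localResponseId = 3;

    // Adopt the value the server reported, if the message contains one.
    OptionalInt expected = parseExpectedResponseId(msg);
    if (expected.isPresent()) {
      localResponseId = expected.getAsInt();
    }
    System.out.println("resynced responseId = " + localResponseId);
  }
}
```

Returning an empty OptionalInt when the message does not match lets a caller fall back to its existing handling (for example, rethrowing the exception) rather than resyncing to a wrong value.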



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


