hadoop-yarn-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-8010) Add config in FederationRMFailoverProxy to not bypass facade cache when failing over
Date Wed, 28 Mar 2018 01:00:02 GMT

    [ https://issues.apache.org/jira/browse/YARN-8010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16416532#comment-16416532 ]

Hudson commented on YARN-8010:
------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13888 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13888/])
YARN-8010. Add config in FederationRMFailoverProxy to not bypass facade (subru: rev 2a2ef15caf791f30c471526c1b74e68803f0c405)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/federation/failover/FederationRMFailoverProxyProvider.java


> Add config in FederationRMFailoverProxy to not bypass facade cache when failing over
> ------------------------------------------------------------------------------------
>
>                 Key: YARN-8010
>                 URL: https://issues.apache.org/jira/browse/YARN-8010
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Botong Huang
>            Assignee: Botong Huang
>            Priority: Minor
>             Fix For: 2.10.0, 2.9.1, 3.1.1
>
>         Attachments: YARN-8010.v1.patch, YARN-8010.v1.patch, YARN-8010.v2.patch, YARN-8010.v3.patch
>
>
> Today, when a YarnRM fails over, the FederationRMFailoverProxy running in AMRMProxy
> performs failover: it tries to get the latest subcluster info from FederationStateStore
> and then retries connecting to the latest YarnRM master. When it calls getSubCluster() on
> FederationStateStoreFacade, it bypasses the cache with a flush flag. During a YarnRM
> failover, every AM heartbeat thread creates a different thread inside FederationInterceptor,
> each of which keeps performing failover several times. This leads to a big spike of
> getSubCluster() calls to FederationStateStore.
> Depending on the cluster setup (e.g. putting a VIP in front of all YarnRMs), a YarnRM
> master/slave change might not result in an RM address change. In other cases, a small delay
> in getting the latest subcluster information may be acceptable. This patch therefore adds a
> config option that makes it possible to ask the FederationRMFailoverProxy not to flush the
> cache when calling getSubCluster().
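
The effect of the flush flag described above can be illustrated with a small, self-contained sketch. It is not the Hadoop implementation: the class SubClusterCacheSketch, its methods, and the address string are hypothetical stand-ins for a cached facade in front of a state store, and the boolean parameter stands in for whatever the new config option controls.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/**
 * Hypothetical stand-in for a cached facade in front of a state store.
 * This is an illustration only, not FederationStateStoreFacade.
 */
public class SubClusterCacheSketch {
  private final Map<String, String> cache = new ConcurrentHashMap<>();
  private final AtomicInteger storeCalls = new AtomicInteger();
  private final Function<String, String> store;

  public SubClusterCacheSketch(Function<String, String> store) {
    this.store = store;
  }

  /**
   * Returns the RM address for a subcluster id.
   * flushCache == true bypasses the cache and always hits the store;
   * flushCache == false hits the store only when the entry is missing.
   */
  public String getSubCluster(String id, boolean flushCache) {
    if (flushCache) {
      String addr = load(id);
      cache.put(id, addr); // refresh the cached entry with the fresh value
      return addr;
    }
    return cache.computeIfAbsent(id, this::load);
  }

  /** Number of lookups that actually reached the backing store. */
  public int storeCalls() {
    return storeCalls.get();
  }

  private String load(String id) {
    storeCalls.incrementAndGet();
    return store.apply(id);
  }

  public static void main(String[] args) {
    // Assumed setup: a VIP in front of the RMs, so the address never changes.
    SubClusterCacheSketch flushing = new SubClusterCacheSketch(id -> "rm-vip:8032");
    SubClusterCacheSketch cached = new SubClusterCacheSketch(id -> "rm-vip:8032");

    // Simulate 100 failover retries, as many heartbeat threads would issue.
    for (int i = 0; i < 100; i++) {
      flushing.getSubCluster("subcluster-1", true);
      cached.getSubCluster("subcluster-1", false);
    }
    System.out.println("store calls with flush:    " + flushing.storeCalls());
    System.out.println("store calls without flush: " + cached.storeCalls());
  }
}
```

In this sketch every flushing lookup reaches the store, while the non-flushing lookups reach it only once. The trade-off is the one the description names: without a flush, a changed RM address is not seen until the cached entry is refreshed some other way.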



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

