[ https://issues.apache.org/jira/browse/YARN-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588205#comment-16588205 ] genericqa commented on YARN-8696: --------------------------------- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 21m 48s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 3m 57s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 24s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 1s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 55s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 1s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 16s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 5 new + 225 unchanged - 0 fixed = 230 total (was 225) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 22s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 48s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 48s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 43s{color} | {color:red} hadoop-yarn-api in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 16s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 26s{color} | {color:green} hadoop-yarn-server-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 49s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 69m 42s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 31s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}210m 12s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.conf.TestYarnConfigurationFields | | | hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 | | JIRA Issue | YARN-8696 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12936523/YARN-8696.v1.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux bca6ce8c969f 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 19:38:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 9c3fc3e | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/21654/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt | | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/21654/artifact/out/whitespace-eol.txt | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/21654/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api.txt | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/21654/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21654/testReport/ | | Max. process+thread count | 908 (vs. ulimit of 10000) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/21654/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > FederationInterceptor upgrade: home sub-cluster heartbeat async > --------------------------------------------------------------- > > Key: YARN-8696 > URL: https://issues.apache.org/jira/browse/YARN-8696 > Project: Hadoop YARN > Issue Type: Sub-task > Reporter: Botong Huang > Assignee: Botong Huang > Priority: Major > Attachments: YARN-8696.v1.patch > > > Today in _FederationInterceptor_, the heartbeat to home sub-cluster is synchronous. After the heartbeat is sent out to home sub-cluster, it waits for the home response to come back before merging and returning the (merged) heartbeat result to back AM. If home sub-cluster is suffering from connection issues, or down during an YarnRM master-slave switch, all heartbeat threads in _FederationInterceptor_ will be blocked waiting for home response. As a result, the successful UAM heartbeats from secondary sub-clusters will not be returned to AM at all. Additionally, because of the fact that we kept the same heartbeat responseId between AM and home RM, lots of tricky handling are needed regarding the responseId resync when it comes to _FederationInterceptor_ (part of AMRMProxy, NM) work preserving restart (YARN-6127, YARN-1336), home RM master-slave switch etc. > In this patch, we change the heartbeat to home sub-cluster to asynchronous, same as the way we handle UAM heartbeats in secondaries. So that any sub-cluster down or connection issues won't impact AM getting responses from other sub-clusters. The responseId is also managed separately for home sub-cluster and AM, and they increment independently. The resync logic becomes much cleaner. -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org For additional commands, e-mail: yarn-issues-help@hadoop.apache.org