[ https://issues.apache.org/jira/browse/YARN-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16084010#comment-16084010
]
Hadoop QA commented on YARN-6102:
---------------------------------
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue}
Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} |
{color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color}
| {color:green} The patch appears to include 3 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 27s{color}
| {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s{color} |
{color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s{color}
| {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 37s{color} |
{color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 59s{color} |
{color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} |
{color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 33s{color}
| {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s{color} |
{color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 33s{color} | {color:green}
the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color}
| {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
The patch generated 0 new + 137 unchanged - 6 fixed = 137 total (was 143) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s{color} |
{color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color}
| {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 12s{color} | {color:red}
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager generated
1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 19s{color} | {color:red}
hadoop-yarn-server-resourcemanager in the patch failed. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 42m 28s{color} | {color:red}
hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s{color}
| {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 64m 18s{color} | {color:black}
{color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
|
| | Unused field:ResourceManager.java |
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-6102 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12876831/YARN-6102.04.patch
|
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs
checkstyle |
| uname | Linux a43864a32289 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017
x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / b628d0d |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| findbugs | https://builds.apache.org/job/PreCommit-YARN-Build/16382/artifact/patchprocess/new-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.html
|
| javadoc | https://builds.apache.org/job/PreCommit-YARN-Build/16382/artifact/patchprocess/patch-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
|
| unit | https://builds.apache.org/job/PreCommit-YARN-Build/16382/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
|
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/16382/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/16382/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org |
This message was automatically generated.
> On failover RM can crash due to unregistered event to AsyncDispatcher
> ---------------------------------------------------------------------
>
> Key: YARN-6102
> URL: https://issues.apache.org/jira/browse/YARN-6102
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.8.0, 2.7.3
> Reporter: Ajith S
> Assignee: Rohith Sharma K S
> Priority: Critical
> Attachments: eventOrder.JPG, YARN-6102.01.patch, YARN-6102.02.patch, YARN-6102.03.patch,
YARN-6102.04.patch
>
>
> {code}2017-01-17 16:42:17,911 FATAL [AsyncDispatcher event handler] event.AsyncDispatcher
(AsyncDispatcher.java:dispatch(200)) - Error in dispatcher thread
> java.lang.Exception: No handler for registered for class org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeEventType
> at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:196)
> at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:120)
> at java.lang.Thread.run(Thread.java:745)
> 2017-01-17 16:42:17,914 INFO [AsyncDispatcher ShutDown handler] event.AsyncDispatcher
(AsyncDispatcher.java:run(303)) - Exiting, bbye..{code}
> The same stack i was also noticed in {{TestResourceTrackerOnHA}} exits abnormally, after
some analysis, i was able to reproduce.
> Once the nodeHeartBeat is sent to RM, inside {{org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService.nodeHeartbeat(NodeHeartbeatRequest)}},
before sending it to dispatcher through
> {{this.rmContext.getDispatcher().getEventHandler().handle(nodeStatusEvent);}} if RM failover
is called, the dispatcher is reset
> The new dispatcher is however first started and then the events are registered at {{org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.reinitialize(boolean)}}
> So event order will look like
> 1. Send Node heartbeat to {{ResourceTrackerService}}
> 2. In {{ResourceTrackerService.nodeHeartbeat}}, before passing to dispatcher call RM
failover
> 3. In RM Failover, current active will reset dispatcher @reinitialize i.e ( {{resetDispatcher();}}
+ {{createAndInitActiveServices();}} )
> Now between {{resetDispatcher();}} and {{createAndInitActiveServices();}} , the {{ResourceTrackerService.nodeHeartbeat}}
invokes dipatcher
> This will cause the above error as at point of time when {{STATUS_UPDATE}} event is given
to dispatcher in {{ResourceTrackerService}} , the new dispatcher(from the failover) may be
started but not yet registered for events
> Using same steps(with pausing JVM at debug), i was able to reproduce this in production
cluster also. for {{STATUS_UPDATE}} active service event, when the service is yet to forward
the event to RM dispatcher but a failover is called and dispatcher reset is between {{resetDispatcher();}}
& {{createAndInitActiveServices();}}
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org
|