hadoop-yarn-issues mailing list archives

From "Jian He (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2649) Flaky test TestAMRMRPCNodeUpdates
Date Mon, 06 Oct 2014 23:32:34 GMT

    [ https://issues.apache.org/jira/browse/YARN-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161210#comment-14161210 ]

Jian He commented on YARN-2649:
-------------------------------

[~mingma], thanks for working on this!
bq. Another way to fix it is to change MockRM.submitApp to waitForState on RMAppAttempt. That might address other test cases that use MockRM.submitApp.
I recently saw a similar test failure elsewhere, e.g. YARN-2483, so maybe this is what we should do. Could you also run all the tests locally to make sure we don't introduce any regressions?
Thanks.
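
For illustration, the wait-for-state idea quoted above can be sketched as a plain polling helper. This is a self-contained toy, not the MockRM code: the class WaitForStateSketch, the AttemptState enum and the waitForState signature are all hypothetical, and the thread does not say which attempt state submitApp should wait for (SCHEDULED is assumed here).

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

public class WaitForStateSketch {
    // Hypothetical stand-in for the attempt states named in the report.
    enum AttemptState { NEW, SUBMITTED, SCHEDULED, ALLOCATED }

    /**
     * Polls the supplier until it reports the expected state or the timeout
     * elapses. Returns true if the state was reached, false on timeout.
     */
    static <T> boolean waitForState(Supplier<T> current, T expected,
                                    long timeoutMs, long pollMs)
            throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (!expected.equals(current.get())) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out before the state was reached
            }
            Thread.sleep(pollMs);
        }
        return true;
    }

    public static void main(String[] args) throws Exception {
        AtomicReference<AttemptState> state = new AtomicReference<>(AttemptState.NEW);

        // Simulates an asynchronous dispatcher advancing the attempt state.
        Thread dispatcher = new Thread(() -> {
            try {
                Thread.sleep(50);
                state.set(AttemptState.SUBMITTED);
                Thread.sleep(50);
                state.set(AttemptState.SCHEDULED);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        dispatcher.start();

        // Assumption: a test would wait here before letting the NM heartbeat.
        boolean reached = waitForState(state::get, AttemptState.SCHEDULED, 5000, 10);
        dispatcher.join();
        System.out.println("reached SCHEDULED: " + reached);
    }
}
```

The point of the sketch is only that the test blocks on the attempt-level state rather than the app-level state before triggering the next step.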

> Flaky test TestAMRMRPCNodeUpdates
> ---------------------------------
>
>                 Key: YARN-2649
>                 URL: https://issues.apache.org/jira/browse/YARN-2649
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Ming Ma
>         Attachments: YARN-2649.patch
>
>
> Sometimes the test fails with the following error:
> testAMRMUnusableNodes(org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRMRPCNodeUpdates)  Time elapsed: 41.73 sec  <<< FAILURE!
> junit.framework.AssertionFailedError: AppAttempt state is not correct (timedout) expected:<ALLOCATED> but was:<SCHEDULED>
> 	at junit.framework.Assert.fail(Assert.java:50)
> 	at junit.framework.Assert.failNotEquals(Assert.java:287)
> 	at junit.framework.Assert.assertEquals(Assert.java:67)
> 	at org.apache.hadoop.yarn.server.resourcemanager.MockAM.waitForState(MockAM.java:82)
> 	at org.apache.hadoop.yarn.server.resourcemanager.MockRM.sendAMLaunched(MockRM.java:382)
> 	at org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRMRPCNodeUpdates.testAMRMUnusableNodes(TestAMRMRPCNodeUpdates.java:125)
> When this happens, SchedulerEventType.NODE_UPDATE is processed before RMAppAttemptEvent.ATTEMPT_ADDED.
> That is possible because the test only waits for RMAppState.ACCEPTED before having the NM
> send a heartbeat. This can be reproduced using a custom AsyncDispatcher with a CountDownLatch.
> Here is the log when this happens.
> {noformat}
> App State is : ACCEPTED
> 2014-10-05 21:25:07,305 INFO  [AsyncDispatcher event handler] attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(670)) - appattempt_1412569506932_0001_000001 State change from NEW to SUBMITTED
> 2014-10-05 21:25:07,305 DEBUG [AsyncDispatcher event handler] event.AsyncDispatcher (AsyncDispatcher.java:dispatch(164)) - Dispatching the event org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeStatusEvent.EventType: STATUS_UPDATE
> 2014-10-05 21:25:07,305 DEBUG [AsyncDispatcher event handler] rmnode.RMNodeImpl (RMNodeImpl.java:handle(384)) - Processing 127.0.0.1:1234 of type STATUS_UPDATE
> AppAttempt : appattempt_1412569506932_0001_000001 State is : SUBMITTED Waiting for state : ALLOCATED
> 2014-10-05 21:25:07,306 DEBUG [AsyncDispatcher event handler] event.AsyncDispatcher (AsyncDispatcher.java:dispatch(164)) - Dispatching the event org.apache.hadoop.yarn.server.resourcemanager.scheduler.event.AppAttemptAddedSchedulerEvent.EventType: APP_ATTEMPT_ADDED
> 2014-10-05 21:25:07,328 DEBUG [AsyncDispatcher event handler] event.AsyncDispatcher (AsyncDispatcher.java:dispatch(164)) - Dispatching the event org.apache.hadoop.yarn.server.resourcemanager.scheduler.event.NodeUpdateSchedulerEvent.EventType: NODE_UPDATE
> 2014-10-05 21:25:07,330 DEBUG [AsyncDispatcher event handler] event.AsyncDispatcher (AsyncDispatcher.java:dispatch(164)) - Dispatching the event org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptEvent.EventType: ATTEMPT_ADDED
> 2014-10-05 21:25:07,331 DEBUG [AsyncDispatcher event handler] attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(658)) - Processing event for appattempt_1412569506932_0001_000001 of type ATTEMPT_ADDED
> 2014-10-05 21:25:07,333 INFO  [AsyncDispatcher event handler] attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(670)) - appattempt_1412569506932_0001_000001 State change from SUBMITTED to SCHEDULED
> {noformat}
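
The ordering described in the report can be modelled with a small stand-alone sketch. This is a toy single-threaded dispatcher, not Hadoop's AsyncDispatcher: the class EventOrderingSketch, its Event and AttemptState enums and the run method are hypothetical, and the two state transitions are a simplification assumed from the log above. A CountDownLatch holds back ATTEMPT_ADDED until NODE_UPDATE has been queued, which forces the failing order.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

public class EventOrderingSketch {
    // Hypothetical, simplified event and state names taken from the report.
    enum Event { ATTEMPT_ADDED, NODE_UPDATE, STOP }
    enum AttemptState { SUBMITTED, SCHEDULED, ALLOCATED }

    /** Runs the toy dispatcher once and returns the final attempt state. */
    static AttemptState run(boolean nodeUpdateFirst) throws InterruptedException {
        BlockingQueue<Event> queue = new LinkedBlockingQueue<>();
        CountDownLatch nodeUpdateQueued = new CountDownLatch(1);
        AttemptState[] state = { AttemptState.SUBMITTED };

        // Single dispatcher thread: handles events strictly in queue order.
        Thread dispatcher = new Thread(() -> {
            try {
                for (Event e = queue.take(); e != Event.STOP; e = queue.take()) {
                    if (e == Event.ATTEMPT_ADDED && state[0] == AttemptState.SUBMITTED) {
                        state[0] = AttemptState.SCHEDULED;
                    } else if (e == Event.NODE_UPDATE && state[0] == AttemptState.SCHEDULED) {
                        state[0] = AttemptState.ALLOCATED;
                    } // a NODE_UPDATE seen before SCHEDULED allocates nothing
                }
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        });
        dispatcher.start();

        // Producer of ATTEMPT_ADDED; optionally held back by the latch.
        Thread attemptProducer = new Thread(() -> {
            try {
                if (nodeUpdateFirst) {
                    nodeUpdateQueued.await();
                }
                queue.put(Event.ATTEMPT_ADDED);
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        });
        attemptProducer.start();

        if (!nodeUpdateFirst) {
            attemptProducer.join(); // well-ordered case: ATTEMPT_ADDED is queued first
        }
        queue.put(Event.NODE_UPDATE); // stands in for the NM heartbeat
        nodeUpdateQueued.countDown();
        attemptProducer.join();
        queue.put(Event.STOP);
        dispatcher.join();
        return state[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("NODE_UPDATE first:   " + run(true));  // SCHEDULED
        System.out.println("ATTEMPT_ADDED first: " + run(false)); // ALLOCATED
    }
}
```

With NODE_UPDATE forced first, the toy attempt ends in SCHEDULED and never reaches ALLOCATED, which matches the assertion failure in the stack trace.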



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
