hadoop-yarn-issues mailing list archives

From "Eric Badger (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1468) TestRMRestart.testRMRestartWaitForPreviousAMToFinish get failed.
Date Wed, 01 Jun 2016 19:45:59 GMT

    [ https://issues.apache.org/jira/browse/YARN-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15310976#comment-15310976 ]

Eric Badger commented on YARN-1468:
-----------------------------------

[~mitdesai], I saw this test fail in the same way that you described above. I took a look
at the test, and either I don't understand the meaning of one of the lines or it's a bug. The
following piece of code (minus the assertEquals) was added by [YARN-1493|https://issues.apache.org/jira/browse/YARN-1493]
and doesn't make sense to me. Why do we wait for the size to reach 2 when we assert that it
equals 4 immediately afterwards? In my local tests, this loop always runs until timeoutSecs
reaches 40, since rmApp.getAppAttempts().size() is equal to 4 the whole time. This leads me
to believe that the assert failure occurs when the loop starts while the size is still equal
to 2. In that case the loop exits immediately, and the size may only have reached 3 (or still
be 2) by the time the assertEquals against 4 is executed.

{noformat}
    // wait for the attempt to be created.
    int timeoutSecs = 0;
    while (rmApp.getAppAttempts().size() != 2 && timeoutSecs++ < 40) {
      Thread.sleep(200);
    }
    Assert.assertEquals(4, rmApp.getAppAttempts().size());
{noformat}

I think changing ".size() != 2" to ".size() != 4" will fix this race in the test. Thoughts?
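
To illustrate the race outside of the real test, here is a minimal, self-contained sketch. It does not use any YARN classes: the class AttemptWaitSketch, the helper waitForSize, and the counter that grows from 2 to 4 are all hypothetical stand-ins for rmApp.getAppAttempts().size(), chosen only to show why the value waited for should match the value asserted on.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;

public class AttemptWaitSketch {

  // Polls the supplier until it reports the expected size or the poll budget
  // is used up, then returns the last size that was observed.
  static int waitForSize(IntSupplier size, int expected, int maxPolls, long sleepMs) {
    int polls = 0;
    int observed = size.getAsInt();
    while (observed != expected && polls++ < maxPolls) {
      try {
        Thread.sleep(sleepMs);
      } catch (InterruptedException e) {
        // Restore the interrupt flag and stop waiting.
        Thread.currentThread().interrupt();
        break;
      }
      observed = size.getAsInt();
    }
    return observed;
  }

  public static void main(String[] args) {
    // Hypothetical stand-in for rmApp.getAppAttempts().size():
    // reports 2 on the first read, then 3, then stays at 4.
    AtomicInteger first = new AtomicInteger(2);
    IntSupplier growingA = () -> Math.min(4, first.getAndIncrement());

    // Waiting for 2 returns immediately, so an assert against 4 would fail.
    System.out.println(waitForSize(growingA, 2, 40, 1L)); // prints 2

    AtomicInteger second = new AtomicInteger(2);
    IntSupplier growingB = () -> Math.min(4, second.getAndIncrement());

    // Waiting for 4 keeps polling until the size the assert expects is reached.
    System.out.println(waitForSize(growingB, 4, 40, 1L)); // prints 4
  }
}
```

In this sketch the only difference between the two calls is the expected value passed to the wait, which mirrors the proposed one-character change in the test.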


cc [~djp]

> TestRMRestart.testRMRestartWaitForPreviousAMToFinish get failed.
> ----------------------------------------------------------------
>
>                 Key: YARN-1468
>                 URL: https://issues.apache.org/jira/browse/YARN-1468
>             Project: Hadoop YARN
>          Issue Type: Test
>          Components: resourcemanager
>            Reporter: Junping Du
>            Assignee: Junping Du
>            Priority: Critical
>
> The log is as follows:
> {code}
> Tests run: 13, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 149.968 sec <<< FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart
> testRMRestartWaitForPreviousAMToFinish(org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart)  Time elapsed: 44.197 sec  <<< FAILURE!
> junit.framework.AssertionFailedError: AppAttempt state is not correct (timedout) expected:<ALLOCATED> but was:<SCHEDULED>
>         at junit.framework.Assert.fail(Assert.java:50)
>         at junit.framework.Assert.failNotEquals(Assert.java:287)
>         at junit.framework.Assert.assertEquals(Assert.java:67)
>         at org.apache.hadoop.yarn.server.resourcemanager.MockAM.waitForState(MockAM.java:82)
>         at org.apache.hadoop.yarn.server.resourcemanager.MockRM.sendAMLaunched(MockRM.java:292)
>         at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.launchAM(TestRMRestart.java:826)
>         at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartWaitForPreviousAMToFinish(TestRMRestart.java:464)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

