hadoop-yarn-issues mailing list archives

From "Tao Jie (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-6249) TestFairSchedulerPreemption is inconsistently failing on trunk
Date Thu, 02 Mar 2017 14:13:45 GMT

    [ https://issues.apache.org/jira/browse/YARN-6249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892305#comment-15892305 ]

Tao Jie commented on YARN-6249:
-------------------------------

I debugged this test and found the root cause of the failure.
In the test, the FsLeafQueues are initialized before {{scheduler.setClock(clock)}} is called in
setup(). As a result, {{lastTimeAtMinShare}} in FsLeafQueue is initialized to the real current
time as a long (a big number), and it is later compared against the time of the
{{ControlledClock}}, which starts from 0.
Here is the relevant part of {{FsLeafQueue#minShareStarvation}}, which is invoked from update():
{code}
    long now = scheduler.getClock().getTime();
    if (!starved) {
      // Record that the queue is not starved
      setLastTimeAtMinShare(now);
    }

    if (now - lastTimeAtMinShare < getMinSharePreemptionTimeout()) {
      // the queue is not starved for the preemption timeout
      starvation = Resources.clone(Resources.none());
    }
{code}
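If {{starved}} is true the first time this method is called, {{lastTimeAtMinShare}} is not reset
for as long as the queue stays starved, so {{now - lastTimeAtMinShare}} remains negative and the
queue never reaches the min share preemption timeout.
However, I don't think this is a bug in the real world, because the issue only arises with
ControlledClock, which is used only in tests.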
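
To make the arithmetic concrete, here is a minimal standalone sketch of the mismatch. It is not the FairScheduler code: {{ManualClock}}, {{QueueSketch}}, {{everStarved}}, the 5000 ms timeout and the example wall-clock value are all hypothetical stand-ins, and only the comparison itself mirrors the snippet above.

```java
public class ClockMismatchSketch {

  /** Hypothetical stand-in for a manually advanced test clock that starts at 0. */
  static class ManualClock {
    private long time = 0;

    long getTime() {
      return time;
    }

    void tick(long millis) {
      time += millis;
    }
  }

  /** Hypothetical stand-in for the queue's min-share bookkeeping. */
  static class QueueSketch {
    private final long minSharePreemptionTimeout;
    private long lastTimeAtMinShare;

    QueueSketch(long initialTime, long minSharePreemptionTimeout) {
      // Mirrors the field being initialized before the test clock is installed.
      this.lastTimeAtMinShare = initialTime;
      this.minSharePreemptionTimeout = minSharePreemptionTimeout;
    }

    /** Returns true if the queue has been starved for at least the timeout. */
    boolean starvedPastTimeout(long now, boolean starved) {
      if (!starved) {
        // Record that the queue is not starved.
        lastTimeAtMinShare = now;
      }
      return starved && now - lastTimeAtMinShare >= minSharePreemptionTimeout;
    }
  }

  /** Ticks the manual clock ten times by 1 s while the queue stays starved. */
  static boolean everStarved(long initialTime) {
    ManualClock clock = new ManualClock();
    QueueSketch queue = new QueueSketch(initialTime, 5000L);
    boolean result = false;
    for (int i = 0; i < 10; i++) {
      clock.tick(1000L);
      result |= queue.starvedPastTimeout(clock.getTime(), true);
    }
    return result;
  }

  public static void main(String[] args) {
    // Field initialized from an example wall-clock value, then compared to a clock at ~0.
    System.out.println(everStarved(1488463425000L)); // prints false
    // Field initialized from the same clock that is used for the comparison.
    System.out.println(everStarved(0L)); // prints true
  }
}
```

In the first call the difference stays hugely negative no matter how far the manual clock advances, so the timeout check never passes. In the second call both values come from the same clock and the timeout is reached after five ticks.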


> TestFairSchedulerPreemption is inconsistently failing on trunk
> --------------------------------------------------------------
>
>                 Key: YARN-6249
>                 URL: https://issues.apache.org/jira/browse/YARN-6249
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler, resourcemanager
>    Affects Versions: 2.9.0
>            Reporter: Sean Po
>            Assignee: Yufei Gu
>
> Tests in TestFairSchedulerPreemption.java will inconsistently fail on trunk. An example stack trace:
> {noformat}
> Tests run: 24, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 24.879 sec <<< FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption
> testPreemptionSelectNonAMContainer[MinSharePreemptionWithDRF](org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption)  Time elapsed: 10.475 sec  <<< FAILURE!
> java.lang.AssertionError: Incorrect number of containers on the greedy app expected:<4> but was:<8>
> 	at org.junit.Assert.fail(Assert.java:88)
> 	at org.junit.Assert.failNotEquals(Assert.java:743)
> 	at org.junit.Assert.assertEquals(Assert.java:118)
> 	at org.junit.Assert.assertEquals(Assert.java:555)
> 	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.verifyPreemption(TestFairSchedulerPreemption.java:288)
> 	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption.testPreemptionSelectNonAMContainer(TestFairSchedulerPreemption.java:363)
> {noformat}


