hadoop-yarn-issues mailing list archives

From "Sunil G (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-4890) Unit test intermittent failure: TestNodeLabelContainerAllocation#testQueueUsedCapacitiesUpdate
Date Sat, 09 Apr 2016 06:07:25 GMT

     [ https://issues.apache.org/jira/browse/YARN-4890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sunil G updated YARN-4890:
--------------------------
    Attachment: 0001-YARN-4890.patch

I mostly agree with [~bibinchundatt]; thanks, Bibin, for the pointer. I started analyzing
this issue after I hit the same failure in YARN-4934.
I am able to reproduce the issue with debug points, and I think the attached fix will resolve
the problem.

Analysis:
- {{waitSchedulerNodeJoined}} depends on the {{nodeTracker}} count. This count is updated
as the first step when a new node is added to CS via the {{addNode}} call (from {{NODE_ADDED}} event
handling).
- After the node is added to {{nodeTracker}}, the new node information is passed to LabelManager
with the {{labelManager.activateNode()}} call. This internally invokes the {{updateResourceMappings}}
method, which tries to update the scheduler with a {{NODE_LABELS_UPDATE}} event.
- In this test case, since the node is added to {{nodeTracker}} first, there is a chance that
the test case resumes and goes on to check the capacity metrics. But quite often
the labels have not yet been updated on the {{SchedulerNode}} via {{NODE_LABELS_UPDATE}} at that point.
- {{updateLabelsOnNode}} updates the labels on {{FiCaSchedulerNode}}. So ideally
this test case should also check whether the intended label has been added to the node.
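
To illustrate the ordering problem described above, here is a minimal, self-contained sketch. It is not the attached patch and does not use any Hadoop classes: {{LabelWaitSketch}}, its {{schedulerNodes}} map, {{waitFor}} and {{waitNodeHasLabel}} are hypothetical stand-ins, and only the names {{addNode}}, {{updateLabelsOnNode}} and {{waitSchedulerNodeJoined}} are borrowed from the analysis. It assumes the label update arrives as a separate, later step than the node registration.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BooleanSupplier;

// Illustrative only: hypothetical stand-ins, not Hadoop classes or the attached patch.
class LabelWaitSketch {
  // nodeId -> labels, playing the role of the scheduler's per-node view.
  static final Map<String, Set<String>> schedulerNodes = new ConcurrentHashMap<>();

  // Step 1 (like NODE_ADDED): the node becomes visible, but has no labels yet.
  static void addNode(String nodeId) {
    schedulerNodes.put(nodeId, Collections.<String>emptySet());
  }

  // Step 2 (like NODE_LABELS_UPDATE): labels arrive later, in a separate step.
  static void updateLabelsOnNode(String nodeId, String label) {
    schedulerNodes.put(nodeId, Collections.singleton(label));
  }

  // Poll a condition until it holds or the timeout expires.
  static boolean waitFor(BooleanSupplier condition, long timeoutMs) {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (!condition.getAsBoolean()) {
      if (System.currentTimeMillis() >= deadline) {
        return false;
      }
      try {
        Thread.sleep(10);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return true;
  }

  // Waiting on the node count alone is satisfied as soon as step 1 is done.
  static boolean waitSchedulerNodeJoined(int expectedNodes, long timeoutMs) {
    return waitFor(() -> schedulerNodes.size() >= expectedNodes, timeoutMs);
  }

  // The additional check: wait until the intended label is on the node.
  static boolean waitNodeHasLabel(String nodeId, String label, long timeoutMs) {
    return waitFor(() -> {
      Set<String> labels = schedulerNodes.get(nodeId);
      return labels != null && labels.contains(label);
    }, timeoutMs);
  }

  public static void main(String[] args) throws InterruptedException {
    addNode("h1");
    // Simulate the label update being delivered after a short delay.
    Thread labelEvent = new Thread(() -> {
      try {
        Thread.sleep(100);
      } catch (InterruptedException e) {
        return;
      }
      updateLabelsOnNode("h1", "x");
    });
    labelEvent.start();

    System.out.println("node joined: " + waitSchedulerNodeJoined(1, 1000));
    System.out.println("label visible right after join: "
        + schedulerNodes.get("h1").contains("x"));
    System.out.println("label visible after waiting: "
        + waitNodeHasLabel("h1", "x", 1000));
    labelEvent.join();
  }
}
```

In this sketch, a test that proceeds as soon as {{waitSchedulerNodeJoined}} returns may still see a node without its label; waiting on {{waitNodeHasLabel}} as well closes that window.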

I have uploaded a patch with this improvement. [~bibinchundatt]/[~leftnoteasy] Thoughts?

> Unit test intermittent failure: TestNodeLabelContainerAllocation#testQueueUsedCapacitiesUpdate
> ----------------------------------------------------------------------------------------------
>
>                 Key: YARN-4890
>                 URL: https://issues.apache.org/jira/browse/YARN-4890
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Wangda Tan
>         Attachments: 0001-YARN-4890.patch
>
>
> Message:
> {code}
> Tests run: 16, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 314.062 sec <<< FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation
> testQueueUsedCapacitiesUpdate(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation)  Time elapsed: 12.426 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<0.3> but was:<0.6>
> 	at org.junit.Assert.fail(Assert.java:88)
> 	at org.junit.Assert.failNotEquals(Assert.java:743)
> 	at org.junit.Assert.assertEquals(Assert.java:519)
> 	at org.junit.Assert.assertEquals(Assert.java:609)
> 	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation.checkQueueUsedCapacity(TestNodeLabelContainerAllocation.java:1163)
> 	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation.testQueueUsedCapacitiesUpdate(TestNodeLabelContainerAllocation.java:1382)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
