hadoop-yarn-issues mailing list archives

From "Robert Kanter (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-7341) TestRouterWebServiceUtil#testMergeMetrics is flakey
Date Mon, 16 Oct 2017 23:47:00 GMT

     [ https://issues.apache.org/jira/browse/YARN-7341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Kanter updated YARN-7341:
--------------------------------
    Attachment: YARN-7341.001.patch

It turns out that this is a real bug introduced by YARN-7095.  {{RouterWebServiceUtil#mergeMetrics}}
takes in two sets of metrics and merges the second into the first.  However, for a number
of the metrics, it simply doubles the first set's value instead of adding the second set's.  For example:
{code:java}
metrics.setTotalNodes(metrics.getTotalNodes() + metrics.getTotalNodes());
{code}
should be
{code:java}
metrics.setTotalNodes(metrics.getTotalNodes() + metricsResponse.getTotalNodes());
{code}
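
To make the difference concrete, here is a minimal, self-contained sketch. The {{Metrics}} class below is a hypothetical stand-in with a single field, not the real YARN metrics type; only the two merge expressions mirror the lines quoted above.

```java
final class MergeSketch {

    // Hypothetical stand-in for the real metrics class; models one field only.
    static final class Metrics {
        private int totalNodes;

        Metrics(int totalNodes) {
            this.totalNodes = totalNodes;
        }

        int getTotalNodes() {
            return totalNodes;
        }

        void setTotalNodes(int totalNodes) {
            this.totalNodes = totalNodes;
        }
    }

    // Buggy form: ignores metricsResponse and doubles the first value.
    static void mergeBuggy(Metrics metrics, Metrics metricsResponse) {
        metrics.setTotalNodes(metrics.getTotalNodes() + metrics.getTotalNodes());
    }

    // Corrected form: adds the second set's value to the first.
    static void mergeFixed(Metrics metrics, Metrics metricsResponse) {
        metrics.setTotalNodes(metrics.getTotalNodes() + metricsResponse.getTotalNodes());
    }

    public static void main(String[] args) {
        Metrics buggy = new Metrics(3);
        mergeBuggy(buggy, new Metrics(5));
        System.out.println("buggy merge: " + buggy.getTotalNodes()); // 6

        Metrics fixed = new Metrics(3);
        mergeFixed(fixed, new Metrics(5));
        System.out.println("fixed merge: " + fixed.getTotalNodes()); // 8
    }
}
```

Note that the two forms agree whenever both inputs hold the same value, which matters for the test behavior described next.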

This should have failed every time, but the test also had a flaw, which only made it flakey.
 The test initializes the two sets of metrics to random values using two different {{Random}} objects,
each seeded with {{System.currentTimeMillis()}}.  However, the code is fast enough that
both objects are often created within the same millisecond, so they get the same seed.  When this happens,
the two sets of metrics have the same values, so doubling the first gives the same result as
adding the second, which masks the bug I described earlier.
If the code is slower (e.g. GC pause, swapping, adding a log statement for the seed, etc.),
then you'll get different seed values and the test will (correctly) fail.
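
The seed collision can be reproduced in isolation. The sketch below is not the test's actual code: the class and method names are hypothetical and the seeds are fixed constants rather than clock readings, so the behavior is repeatable. It relies only on the documented property of {{java.util.Random}} that two instances created with the same seed produce the same sequence.

```java
import java.util.Random;

final class SeedCollisionSketch {

    // Returns true if two Random objects with the given seeds produce
    // the same first `count` values in the range [0, 1000).
    static boolean sameSequence(long seedA, long seedB, int count) {
        Random a = new Random(seedA);
        Random b = new Random(seedB);
        for (int i = 0; i < count; i++) {
            if (a.nextInt(1000) != b.nextInt(1000)) {
                return false;
            }
        }
        return true;
    }

    // The buggy merge computes first + first; the correct one computes
    // first + second. The bug is invisible exactly when the two agree.
    static boolean bugIsMasked(int first, int second) {
        return first + first == first + second;
    }

    public static void main(String[] args) {
        // Same seed, as when both objects are created in the same millisecond.
        System.out.println("same seed, same values: " + sameSequence(42L, 42L, 100));
        // Different seeds, as when the clock ticks between the two calls.
        System.out.println("different seed, same values: " + sameSequence(42L, 43L, 100));

        System.out.println("equal inputs mask the bug: " + bugIsMasked(584, 584));
        System.out.println("unequal inputs mask the bug: " + bugIsMasked(584, 508));
    }
}
```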

The 001 patch fixes the bug by using the correct metric in {{RouterWebServiceUtil#mergeMetrics}}.
 It also fixes the test by ensuring that the two seeds are different, cleans up
some formatting, and logs the seed for better debuggability.
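
One way to guarantee distinct seeds, shown here only as an illustrative sketch and not necessarily what the patch does, is to read the clock once, log that value, and derive the second seed from the first. The class and method names are hypothetical.

```java
import java.util.Random;

final class DistinctSeedsSketch {

    // Derives two seeds from one base value. They always differ,
    // no matter how quickly the caller runs.
    static long[] distinctSeeds(long baseSeed) {
        return new long[] {baseSeed, baseSeed + 1};
    }

    public static void main(String[] args) {
        long baseSeed = System.currentTimeMillis();
        // Logging the seed lets a failing run be replayed with the same values.
        System.out.println("Using base seed " + baseSeed);

        long[] seeds = distinctSeeds(baseSeed);
        Random first = new Random(seeds[0]);
        Random second = new Random(seeds[1]);

        System.out.println("first set sample:  " + first.nextInt(1000));
        System.out.println("second set sample: " + second.nextInt(1000));
    }
}
```

Reading the clock a single time removes the dependence on how long the code between the two {{Random}} constructions takes to run.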

> TestRouterWebServiceUtil#testMergeMetrics is flakey
> ---------------------------------------------------
>
>                 Key: YARN-7341
>                 URL: https://issues.apache.org/jira/browse/YARN-7341
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: federation
>    Affects Versions: 3.0.0-beta1
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: YARN-7341.001.patch
>
>
> {{TestRouterWebServiceUtil#testMergeMetrics}} is flakey.  It sometimes fails with something like:
> {noformat}
> Running org.apache.hadoop.yarn.server.router.webapp.TestRouterWebServiceUtil
> Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.252 sec <<< FAILURE! - in org.apache.hadoop.yarn.server.router.webapp.TestRouterWebServiceUtil
> testMergeMetrics(org.apache.hadoop.yarn.server.router.webapp.TestRouterWebServiceUtil)  Time elapsed: 0.005 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<1092> but was:<584>
> 	at org.junit.Assert.fail(Assert.java:88)
> 	at org.junit.Assert.failNotEquals(Assert.java:743)
> 	at org.junit.Assert.assertEquals(Assert.java:118)
> 	at org.junit.Assert.assertEquals(Assert.java:555)
> 	at org.junit.Assert.assertEquals(Assert.java:542)
> 	at org.apache.hadoop.yarn.server.router.webapp.TestRouterWebServiceUtil.testMergeMetrics(TestRouterWebServiceUtil.java:473)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

