hadoop-yarn-issues mailing list archives

From "Naganarasimha G R (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4350) TestDistributedShell fails for V2 scenarios
Date Fri, 18 Dec 2015 16:38:46 GMT

    [ https://issues.apache.org/jira/browse/YARN-4350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15064170#comment-15064170 ]

Naganarasimha G R commented on YARN-4350:
-----------------------------------------

Hi [~varun_saxena], as discussed offline, this seems to be a problem with the Distributed
shell AM. {{TestDistributedShell.checkTimelineV1}} checks that only the 2 requested containers
are launched, but in reality more than 2 are getting launched.
Possible reasons are:
* The RM has assigned additional containers and the Distributed shell AM is launching them.
I have observed similar over-assignment in MR as well, but the MR AM takes care of returning
the extra containers assigned by the RM. A similar approach should exist in the Distributed shell AM too.
* The RM has killed a container for some reason, and an extra container is launched as a result.
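
To make the first case concrete, here is a minimal sketch of the "return the extras" idea. It is not the Distributed shell AM or MR AM code: the class {{ExtraContainerSketch}}, its method names, and the use of plain strings as container ids are all hypothetical stand-ins. In a real AM the release step would presumably go through something like {{AMRMClientAsync.releaseAssignedContainer}}, but that wiring is an assumption and is left out here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical model of an AM allocation callback that launches only the
 * requested number of containers and hands any surplus back.
 * Container ids are plain strings here; this is not the YARN API.
 */
public class ExtraContainerSketch {
    private final int requested;
    private final List<String> launched = new ArrayList<>();
    private final List<String> released = new ArrayList<>();

    public ExtraContainerSketch(int requested) {
        this.requested = requested;
    }

    /** Called each time the (modelled) RM hands over a batch of containers. */
    public void onContainersAllocated(List<String> allocated) {
        for (String id : allocated) {
            if (launched.size() < requested) {
                // Still below the requested count: launch on this container.
                launched.add(id);
            } else {
                // Over-assigned: record it as released instead of launching.
                released.add(id);
            }
        }
    }

    public List<String> getLaunched() {
        return launched;
    }

    public List<String> getReleased() {
        return released;
    }

    public static void main(String[] args) {
        ExtraContainerSketch am = new ExtraContainerSketch(2);
        // The modelled RM assigns three containers although only two were requested.
        am.onContainersAllocated(Arrays.asList("container_01", "container_02", "container_03"));
        System.out.println("launched=" + am.getLaunched() + " released=" + am.getReleased());
    }
}
```

With a guard like this, a check that exactly 2 containers were launched would still hold when the RM over-assigns, because the third container is released rather than launched.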

I am not sure which of these cases is causing the additional containers to be assigned. To analyze
this we need more RM and AM logs, which the test case logs do not provide. Further, it is
not related to the fixes for this issue, and IMO it can also occur in trunk. So I think
we can raise another jira to track this.

> TestDistributedShell fails for V2 scenarios
> -------------------------------------------
>
>                 Key: YARN-4350
>                 URL: https://issues.apache.org/jira/browse/YARN-4350
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Sangjin Lee
>            Assignee: Naganarasimha G R
>         Attachments: YARN-4350-feature-YARN-2928.001.patch, YARN-4350-feature-YARN-2928.002.patch,
YARN-4350-feature-YARN-2928.003.patch
>
>
> Currently TestDistributedShell does not pass on the feature-YARN-2928 branch. There seem to be 2 distinct issues.
> (1) testDSShellWithoutDomainV2* tests fail sporadically
> These tests fail more often than not if tested by themselves:
> {noformat}
> testDSShellWithoutDomainV2DefaultFlow(org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell)  Time elapsed: 30.998 sec  <<< FAILURE!
> java.lang.AssertionError: Application created event should be published atleast once expected:<1> but was:<0>
> 	at org.junit.Assert.fail(Assert.java:88)
> 	at org.junit.Assert.failNotEquals(Assert.java:743)
> 	at org.junit.Assert.assertEquals(Assert.java:118)
> 	at org.junit.Assert.assertEquals(Assert.java:555)
> 	at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.checkTimelineV2(TestDistributedShell.java:451)
> 	at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:326)
> 	at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithoutDomainV2DefaultFlow(TestDistributedShell.java:207)
> {noformat}
> They started happening after YARN-4129. I suspect this might have to do with some timing issue.
> (2) the whole test times out
> If you run the whole TestDistributedShell test, it times out without fail. This may or may not have to do with the port change introduced by YARN-2859 (just a hunch).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
