hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2462) TestNodeManagerResync#testBlockNewContainerRequestsOnStartAndResync should have a test timeout
Date Thu, 28 Aug 2014 13:58:08 GMT

    [ https://issues.apache.org/jira/browse/YARN-2462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113770#comment-14113770 ]

Jason Lowe commented on YARN-2462:
----------------------------------

Recent Jenkins builds have been failing with address bind exceptions.  When I looked on the
Jenkins machine I found a TestNodeManagerResync session that had been hung for weeks.  Here's
a portion of the jstack:

{noformat}
"main" prio=10 tid=0x00007fe1b4008000 nid=0x7b73 waiting on condition [0x00007fe1b95ce000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00000000c0127ab0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
	at java.util.concurrent.CyclicBarrier.dowait(CyclicBarrier.java:199)
	at java.util.concurrent.CyclicBarrier.await(CyclicBarrier.java:327)
	at org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerResync.testBlockNewContainerRequestsOnStartAndResync(TestNodeManagerResync.java:178)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
{noformat}

So we're stuck in the testBlockNewContainerRequestsOnStartAndResync test, which I noticed
doesn't have a test timeout specified.  Since it can sometimes hang indefinitely, we should
add a timeout so it doesn't linger when things go wrong.
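
As a rough illustration of the failure mode in the jstack above (not the actual patch, which this message does not include): the main thread is parked in CyclicBarrier.await() with no bound, so if the other party never arrives it waits forever. In JUnit 4 the usual guard is a timeout on the annotation, e.g. @Test(timeout = 60000), where the 60000 ms value is only an assumed example. The stdlib-only sketch below shows the same idea at the barrier level using the timed await overload; the class and method names are hypothetical.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Hadoop or TestNodeManagerResync.
final class BoundedBarrierWait {

    /**
     * Waits on the barrier for at most timeoutMs milliseconds.
     * Returns true if all parties arrived, false if the wait timed out
     * or the barrier was broken by another party.
     */
    static boolean awaitWithTimeout(CyclicBarrier barrier, long timeoutMs)
            throws InterruptedException {
        try {
            barrier.await(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException | BrokenBarrierException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Two parties are expected, but only the main thread ever arrives,
        // mimicking a peer that failed before reaching the barrier.
        CyclicBarrier barrier = new CyclicBarrier(2);
        boolean released = awaitWithTimeout(barrier, 200);
        System.out.println("released=" + released); // prints "released=false"
    }
}
```

With the untimed await() the main method above would never return; with the timed overload it gives up after 200 ms and the caller can fail the test instead of leaving a hung JVM behind.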

> TestNodeManagerResync#testBlockNewContainerRequestsOnStartAndResync should have a test timeout
> ----------------------------------------------------------------------------------------------
>
>                 Key: YARN-2462
>                 URL: https://issues.apache.org/jira/browse/YARN-2462
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.5.0
>            Reporter: Jason Lowe
>
> TestNodeManagerResync#testBlockNewContainerRequestsOnStartAndResync can hang indefinitely and should have a test timeout.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
