drill-dev mailing list archives

From Daniel Barclay <dbarc...@maprtech.com>
Subject Threads left after Drillbit shutdown (in dev./unit tests)
Date Fri, 10 Jul 2015 20:10:02 GMT
Is Drill terminating threads correctly?

Running jstack against the JVM of a dev. test run that hung after
about three test timeout errors, I see that there are 409 threads.

Although 138 of those are not-unexpected ShutdownHook threads (since
many tests are run in one VM), there are:
- 138 "WorkManager.StatusThread" threads (hmm.... 138 again)
-   7 "Client-1" threads
-   4 "UserServer-1" threads
-  21 "BitClient-1" threads
-   4 "BitClient-2" threads
-   3 "BitClient-3" threads
-   8 "BitServer-1" threads
-   8 "BitServer-2" threads
-   7 "BitServer-3" threads
-   7 "BitServer-4" threads
-   7 "BitServer-5" threads
-   6 "BitServer-6" threads
-   6 "BitServer-7" threads
-   6 "BitServer-8" threads
-   5 "BitServer-9" threads
-   5 "BitServer-10" threads
(Other thread names have only 1 or 2 occurrences.)
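
For anyone who wants a similar tally without going through jstack, here is a minimal sketch that counts live threads by name from inside the JVM. The class and method names (ThreadNameTally, tally, tallyLiveThreads) are made up for illustration and are not part of Drill; the only JDK call it relies on is Thread.getAllStackTraces().

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Drill: tallies thread names so that
// repeated names (e.g., many "BitServer-1" threads) stand out.
public class ThreadNameTally {

  // Counts how often each name occurs; the TreeMap keeps names sorted.
  public static Map<String, Integer> tally(Collection<String> names) {
    Map<String, Integer> counts = new TreeMap<>();
    for (String name : names) {
      Integer old = counts.get(name);
      counts.put(name, old == null ? 1 : old + 1);
    }
    return counts;
  }

  // Takes a snapshot of the names of all live threads in this JVM.
  public static Map<String, Integer> tallyLiveThreads() {
    List<String> names = new ArrayList<>();
    for (Thread t : Thread.getAllStackTraces().keySet()) {
      names.add(t.getName());
    }
    return tally(names);
  }

  public static void main(String[] args) {
    // Prints one line per distinct thread name, count first.
    for (Map.Entry<String, Integer> e : tallyLiveThreads().entrySet()) {
      System.out.printf("%4d  \"%s\"%n", e.getValue(), e.getKey());
    }
  }
}
```

Calling tallyLiveThreads() from a test-class teardown (an assumption about where it would be useful, not something the tests do today) would show which names accumulate from one test class to the next.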

Regarding the count of 4 "UserServer-1" threads:  Three test methods
had timeout failures and one hung.


Here's the tail end of the output from the test run, including all
the timeout errors and the hang (but omitting repeated query-results
data lines).



dbarclay@dev-linux2 ~/work/git/incubator-drill $ time mvn install

<TRIMMED>

Running org.apache.drill.exec.physical.impl.TestDistributedFragmentRun
Running org.apache.drill.exec.physical.impl.TestDistributedFragmentRun#oneBitOneExchangeOneEntryRun
Running org.apache.drill.exec.physical.impl.TestDistributedFragmentRun#twoBitOneExchangeTwoEntryRun
Running org.apache.drill.exec.physical.impl.TestDistributedFragmentRun#oneBitOneExchangeTwoEntryRun
Running org.apache.drill.exec.physical.impl.TestDistributedFragmentRun#oneBitOneExchangeTwoEntryRunLogical
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 48.117 sec - in org.apache.drill.exec.physical.impl.TestDistributedFragmentRun
Running org.apache.drill.exec.physical.impl.TestBroadcastExchange
Running org.apache.drill.exec.physical.impl.TestBroadcastExchange#TestSingleBroadcastExchangeWithTwoScans
00:44:34.017 [globalEventExecutor-1-523] ERROR o.a.z.server.NIOServerCnxnFactory - Thread Thread[globalEventExecutor-1-523,5,main] died
java.lang.AssertionError: null
	at io.netty.util.concurrent.AbstractScheduledEventExecutor.pollScheduledTask(AbstractScheduledEventExecutor.java:83) ~[netty-common-4.0.27.Final.jar:4.0.27.Final]
	at io.netty.util.concurrent.GlobalEventExecutor.fetchFromScheduledTaskQueue(GlobalEventExecutor.java:110) ~[netty-common-4.0.27.Final.jar:4.0.27.Final]
	at io.netty.util.concurrent.GlobalEventExecutor.takeTask(GlobalEventExecutor.java:95) ~[netty-common-4.0.27.Final.jar:4.0.27.Final]
	at io.netty.util.concurrent.GlobalEventExecutor$TaskRunner.run(GlobalEventExecutor.java:226) ~[netty-common-4.0.27.Final.jar:4.0.27.Final]
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137) ~[netty-common-4.0.27.Final.jar:4.0.27.Final]
	at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_72]
Running org.apache.drill.exec.physical.impl.TestBroadcastExchange#TestMultipleSendLocationBroadcastExchange
10000
Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 111.599 sec <<< FAILURE! - in org.apache.drill.exec.physical.impl.TestBroadcastExchange
TestSingleBroadcastExchangeWithTwoScans(org.apache.drill.exec.physical.impl.TestBroadcastExchange)  Time elapsed: 50.063 sec  <<< ERROR!
java.lang.Exception: test timed out after 50000 milliseconds
	at java.lang.Object.wait(Native Method)
	at java.lang.Object.wait(Object.java:503)
	at io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:254)
	at io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:32)
	at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:31)
	at org.apache.drill.exec.rpc.BasicServer.close(BasicServer.java:218)
	at com.google.common.io.Closeables.close(Closeables.java:77)
	at com.google.common.io.Closeables.closeQuietly(Closeables.java:108)
	at org.apache.drill.exec.rpc.data.DataConnectionCreator.close(DataConnectionCreator.java:70)
	at com.google.common.io.Closeables.close(Closeables.java:77)
	at com.google.common.io.Closeables.closeQuietly(Closeables.java:108)
	at org.apache.drill.exec.service.ServiceEngine.close(ServiceEngine.java:88)
	at com.google.common.io.Closeables.close(Closeables.java:77)
	at com.google.common.io.Closeables.closeQuietly(Closeables.java:108)
	at org.apache.drill.exec.server.Drillbit.close(Drillbit.java:288)
	at org.apache.drill.exec.physical.impl.TestBroadcastExchange.TestSingleBroadcastExchangeWithTwoScans(TestBroadcastExchange.java:62)

TestMultipleSendLocationBroadcastExchange(org.apache.drill.exec.physical.impl.TestBroadcastExchange)  Time elapsed: 50.014 sec  <<< ERROR!
java.lang.Exception: test timed out after 50000 milliseconds
	at java.lang.Object.wait(Native Method)
	at java.lang.Object.wait(Object.java:503)
	at io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:254)
	at io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:32)
	at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:31)
	at org.apache.drill.exec.rpc.BasicServer.close(BasicServer.java:218)
	at org.apache.drill.exec.rpc.user.UserServer.close(UserServer.java:283)
	at com.google.common.io.Closeables.close(Closeables.java:77)
	at com.google.common.io.Closeables.closeQuietly(Closeables.java:108)
	at org.apache.drill.exec.service.ServiceEngine.close(ServiceEngine.java:87)
	at com.google.common.io.Closeables.close(Closeables.java:77)
	at com.google.common.io.Closeables.closeQuietly(Closeables.java:108)
	at org.apache.drill.exec.server.Drillbit.close(Drillbit.java:288)
	at org.apache.drill.exec.physical.impl.TestBroadcastExchange.TestMultipleSendLocationBroadcastExchange(TestBroadcastExchange.java:88)

Running org.apache.drill.exec.physical.impl.partitionsender.TestPartitionSender
Running org.apache.drill.exec.physical.impl.partitionsender.TestPartitionSender#testPartitionSenderCostToThreads
Running org.apache.drill.exec.physical.impl.partitionsender.TestPartitionSender#testAlgorithm
ok	summary
true	planner.slice_target updated.
Total rows returned : 1.  Returned in 38ms.
Jul 10, 2015 12:47:20 AM org.apache.calcite.sql.validate.SqlValidatorException <init>
SEVERE: org.apache.calcite.sql.validate.SqlValidatorException: Table 'dfs./home/dbarclay/work/git/incubator-drill/exec/java-exec/target/junit5218680774082947123/junit6147831434075535799' not found
Jul 10, 2015 12:47:20 AM org.apache.calcite.runtime.CalciteException <init>
SEVERE: org.apache.calcite.runtime.CalciteContextException: From line 1, column 64 to line 1, column 66: Table 'dfs./home/dbarclay/work/git/incubator-drill/exec/java-exec/target/junit5218680774082947123/junit6147831434075535799' not found
Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 63.904 sec <<< FAILURE! - in org.apache.drill.exec.physical.impl.partitionsender.TestPartitionSender
testPartitionSenderCostToThreads(org.apache.drill.exec.physical.impl.partitionsender.TestPartitionSender)  Time elapsed: 50.023 sec  <<< ERROR!
java.lang.Exception: test timed out after 50000 milliseconds
	at java.lang.Object.wait(Native Method)
	at java.lang.Object.wait(Object.java:503)
	at io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:254)
	at io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:32)
	at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:31)
	at org.apache.drill.exec.rpc.BasicServer.close(BasicServer.java:218)
	at org.apache.drill.exec.rpc.user.UserServer.close(UserServer.java:283)
	at com.google.common.io.Closeables.close(Closeables.java:77)
	at com.google.common.io.Closeables.closeQuietly(Closeables.java:108)
	at org.apache.drill.exec.service.ServiceEngine.close(ServiceEngine.java:87)
	at com.google.common.io.Closeables.close(Closeables.java:77)
	at com.google.common.io.Closeables.closeQuietly(Closeables.java:108)
	at org.apache.drill.exec.server.Drillbit.close(Drillbit.java:288)
	at org.apache.drill.BaseTestQuery.closeClient(BaseTestQuery.java:239)
	at org.apache.drill.BaseTestQuery.updateTestCluster(BaseTestQuery.java:126)
	at org.apache.drill.exec.physical.impl.partitionsender.TestPartitionSender.testPartitionSenderCostToThreads(TestPartitionSender.java:154)

Running org.apache.drill.exec.physical.impl.xsort.TestSimpleExternalSort
Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0 sec - in org.apache.drill.exec.physical.impl.xsort.TestSimpleExternalSort
Running org.apache.drill.exec.physical.impl.svremover.TestSVRemover
Running org.apache.drill.exec.physical.impl.svremover.TestSVRemover#testSelectionVectorRemoval
blue	red	green
2147483647	9223372036854775807	2147483647
2147483647	9223372036854775807	2147483647

<TRIMMED>

2147483647	9223372036854775807	2147483647
Total rows returned : 50.  Returned in 181ms.
Running org.apache.drill.exec.physical.impl.svremover.TestSVRemover#testSVRWithNoFilter
blue	red	green
true	-9223372036854775808	-2147483648
false	9223372036854775807	null

<TRIMMED>

true	-9223372036854775808	-2147483648
false	9223372036854775807	null
Total rows returned : 100.  Returned in 54ms.
   C-c C-c
real	708m38.962s
user	11m3.332s
sys	1m8.068s
[130]dbarclay@dev-linux2 ~/work/git/incubator-drill $
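
All three timeout traces above end in BasicServer.close(BasicServer.java:218) waiting in AbstractFuture.get(), which has no time limit. As an illustration only, here is a minimal sketch of waiting on a shutdown future for a bounded time instead. It is not Drill's code: BoundedShutdownWait and awaitQuietly are hypothetical names, it uses the JDK's java.util.concurrent.Future rather than Netty's future type, and it assumes Java 8 or later for CompletableFuture (the run above was on 1.7.0_72).

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not Drill code: waits for a shutdown future for a
// bounded time instead of blocking indefinitely.
public class BoundedShutdownWait {

  // Returns true if the future finished (normally, with a failure, or by
  // cancellation) within the timeout; false on timeout or interruption.
  public static boolean awaitQuietly(Future<?> shutdown, long timeout, TimeUnit unit) {
    try {
      shutdown.get(timeout, unit);
      return true;
    } catch (TimeoutException e) {
      return false; // still not done; the caller can log this and move on
    } catch (ExecutionException | CancellationException e) {
      return true; // finished, although not successfully
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt status
      return false;
    }
  }

  public static void main(String[] args) {
    Future<Void> done = CompletableFuture.completedFuture(null);
    Future<Void> stuck = new CompletableFuture<>(); // never completed
    System.out.println(awaitQuietly(done, 100, TimeUnit.MILLISECONDS));
    System.out.println(awaitQuietly(stuck, 100, TimeUnit.MILLISECONDS));
  }
}
```

A bounded wait like this would not by itself explain why the shutdown future never completes; it would only let a close() call report the problem rather than hang until the test timeout fires.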





Daniel
-- 
Daniel Barclay
MapR Technologies
