lucene-dev mailing list archives

From "Hoss Man (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-11911) TestLargeCluster.testSearchRate() failure
Date Fri, 16 Mar 2018 01:23:00 GMT

    [ https://issues.apache.org/jira/browse/SOLR-11911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16401358#comment-16401358 ]

Hoss Man commented on SOLR-11911:
---------------------------------

bq. SOLR-11911: Wait a while for left-behind threads from executors.

Increasing the wait time just kicks the can down the road -- the real questions are:
# why these executor tasks aren't aborting quickly
#* If the Callable instances being submitted to the executors can take a non-trivial amount
of time, then they should be checking the shutdown status of the executor frequently
# why the threads are being reported as leaks, instead of the test timing out when shutting
down the nodes
#* MiniSolrCloudCluster.shutdown() calls shutdown on each of the jetty instances in independent
threads so they can be shut down in parallel, but it still waits for all the jetties to finish
their shutdown before it lets the test finish -- and if the lifecycle of the executor is
being managed correctly, shouldn't the shutdown of the Solr node block until these autoscaling
executors finish their shutdown?
#* so even if one of these executor tasks was effectively blocked forever, shouldn't that
be causing the test to time out, not report a leaked thread?
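
To make both points concrete, here is a minimal sketch using only java.util.concurrent. It is not the Solr autoscaling code: the class name CooperativeShutdownSketch and the methods pollingTask and shutdownAndWait are hypothetical, and the 10 ms sleep is just a stand-in for one unit of real work. The task checks the executor's shutdown status between units of work, and the owner blocks in awaitTermination so that a stuck task shows up as a shutdown that never completes rather than as a thread left behind.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class CooperativeShutdownSketch {

    /**
     * Hypothetical long-running task: between units of work it checks whether
     * its executor has been shut down (or its thread interrupted) and aborts.
     */
    static Callable<Integer> pollingTask(ExecutorService executor, AtomicInteger stepsDone) {
        return () -> {
            while (!executor.isShutdown() && !Thread.currentThread().isInterrupted()) {
                try {
                    Thread.sleep(10); // stand-in for one short unit of real work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    break;
                }
                stepsDone.incrementAndGet();
            }
            return stepsDone.get();
        };
    }

    /**
     * Hypothetical owner-side shutdown: request shutdown, then block until the
     * executor's tasks have finished or the timeout elapses. Returns true only
     * if the executor actually terminated.
     */
    static boolean shutdownAndWait(ExecutorService executor, long timeoutMs)
            throws InterruptedException {
        executor.shutdown(); // running tasks are not interrupted; they must notice isShutdown()
        return executor.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        AtomicInteger steps = new AtomicInteger();
        executor.submit(pollingTask(executor, steps));
        Thread.sleep(100); // let the task run for a while
        boolean terminated = shutdownAndWait(executor, 5000);
        System.out.println("terminated=" + terminated + ", steps=" + steps.get());
    }
}
```

If the task never checked isShutdown(), shutdownAndWait would return false after the timeout, which is the condition the owner of the executor should treat as a failure instead of returning and leaving the thread running.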


> TestLargeCluster.testSearchRate() failure
> -----------------------------------------
>
>                 Key: SOLR-11911
>                 URL: https://issues.apache.org/jira/browse/SOLR-11911
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Steve Rowe
>            Assignee: Andrzej Bialecki 
>            Priority: Major
>
> My Jenkins found a branch_7x seed that reproduced 4/5 times for me:
> {noformat}
> Checking out Revision af9706cb89335a5aa04f9bcae0c2558a61803b50 (refs/remotes/origin/branch_7x)
> [...]
>    [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestLargeCluster -Dtests.method=testSearchRate -Dtests.seed=2D7724685882A83D -Dtests.slow=true -Dtests.locale=be-BY -Dtests.timezone=Africa/Ouagadougou -Dtests.asserts=true -Dtests.file.encoding=UTF-8
>    [junit4] FAILURE 1.24s J0  | TestLargeCluster.testSearchRate <<<
>    [junit4]    > Throwable #1: java.lang.AssertionError: The trigger did not fire at all
>    [junit4]    > 	at __randomizedtesting.SeedInfo.seed([2D7724685882A83D:703F3AE197440E72]:0)
>    [junit4]    > 	at org.apache.solr.cloud.autoscaling.sim.TestLargeCluster.testSearchRate(TestLargeCluster.java:547)
>    [junit4]    > 	at java.lang.Thread.run(Thread.java:748)
> [...]
>    [junit4]   2> NOTE: test params are: codec=CheapBastard, sim=RandomSimilarity(queryNorm=true): {}, locale=be-BY, timezone=Africa/Ouagadougou
>    [junit4]   2> NOTE: Linux 4.1.0-custom2-amd64 amd64/Oracle Corporation 1.8.0_151 (64-bit)/cpus=16,threads=1,free=388243840,total=502267904
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


