drill-dev mailing list archives

From Sudheesh Katkam <skat...@maprtech.com>
Subject Re: TestDrillbitResilience broken? assertion errors; now slow/hung, with 278 threads!
Date Wed, 29 Apr 2015 15:07:05 GMT
*ran the tests before checking them in. 

> On Apr 29, 2015, at 7:53 AM, Sudheesh Katkam <skatkam@maprtech.com> wrote:
> 
> I am responsible for those tests. I ran the tests at least 10 times on my Linux VM with
> 1 second pauses, all of which passed.
> 
> On your second run, what different errors did you see?
> 
> On your third run, are you able to reproduce the test case that hangs?
> 
> Sorry that the message is not informative. I already have a patch which is a slight improvement
> to Jacques's change that improves the message in those tests.
> 
> What tool did you use to get the thread count?
> 
> - Sudheesh
> 
> Sent from my iPhone. Pardon any typos.
> 
>> On Apr 29, 2015, at 6:28 AM, Abdel Hakim Deneche <adeneche@maprtech.com> wrote:
>> 
>> The message displayed in the first run actually contains two different
>> issues:
>> 
>> 1. The error message "Error shutting down Drillbit 'beta'" is most likely
>> caused by this issue DRILL-2878
>> <https://issues.apache.org/jira/browse/DRILL-2878>
>> 
>> 2. The test that failed with a "java.lang.AssertionError: null" is most
>> likely a bug because that unit test should not fail. I've seen this error
>> before, but it only happens intermittently.
>> 
>> The system error reported in the 3rd run is actually an "expected" injected
>> exception, but 278 threads looks suspicious!!!
>> 
>> On Wed, Apr 29, 2015 at 12:13 AM, Daniel Barclay <dbarclay@maprtech.com>
>> wrote:
>> 
>>> Does anyone know what's going on with TestDrillbitResilience (rebased
>>> from master today)?  (Is it working right?)
>>> 
>>> 
>>> One run, via "mvn install", yielded assertion errors:
>>> 
>>> ...
>>> Error shutting down Drillbit "beta".
>>> Tests run: 11, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 33.811
>>> sec <<< FAILURE! - in org.apache.drill.exec.server.TestDrillbitResilience
>>> cancelAfterEverythingIsCompleted(org.apache.drill.exec.server.TestDrillbitResilience)
>>> Time elapsed: 1.468 sec  <<< FAILURE!
>>> java.lang.AssertionError: null
>>>       at
>>> org.apache.drill.exec.server.TestDrillbitResilience.assertCancelled(TestDrillbitResilience.java:459)
>>>       at
>>> org.apache.drill.exec.server.TestDrillbitResilience.cancelAfterEverythingIsCompleted(TestDrillbitResilience.java:565)
>>> 
>>> cancelInMiddleOfFetchingResults(org.apache.drill.exec.server.TestDrillbitResilience)
>>> Time elapsed: 1.496 sec  <<< FAILURE!
>>> java.lang.AssertionError: null
>>>       at
>>> org.apache.drill.exec.server.TestDrillbitResilience.assertCancelled(TestDrillbitResilience.java:459)
>>>       at
>>> org.apache.drill.exec.server.TestDrillbitResilience.cancelInMiddleOfFetchingResults(TestDrillbitResilience.java:510)
>>> 
>>> Running <next test>
>>> ...
>>> 
>>> 
>>> A second run, run individually (but still via Maven) died with different
>>> errors.
>>> 
>>> 
>>> 
>>> A third run, via "mvn install" again, seems hung after reporting this
>>> (maybe expected) exception:
>>> 
>>> Exception (no rows returned):
>>> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
>>> run-try-end
>>> 
>>> 
>>> [fb9cfe61-af6e-4c9c-b6ab-8a1b8725c6e9 on dev-linux2:31010]
>>> 
>>> 
>>> The process is using only about 5% CPU--but has 278 threads!
>>> (That includes about 35 threads all with the same name of "BitClient-1".)
>>> 
>>> 
>>> Daniel
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> --
>>> Daniel Barclay
>>> MapR Technologies
>> 
>> 
>> 
>> -- 
>> 
>> Abdelhakim Deneche
>> 
>> Software Engineer
>> 
>> <http://www.mapr.com/>
>> 
>> 
>> Now Available - Free Hadoop On-Demand Training
>> <http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available>
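
On the thread-count question raised in this thread (278 threads, about 35 of them named "BitClient-1"): the thread does not say which tool produced those numbers. As a minimal sketch, assuming the count is wanted from inside the JVM under test, the standard `Thread.getAllStackTraces()` call can group live threads by name. The class name `ThreadNameCount` and the method `countByName` are hypothetical and not part of Drill. For an already-hung process, the JDK's `jstack <pid>` gives a comparable dump from outside the JVM.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Drill: groups the live threads of the
// current JVM by name so that repeated names (e.g. "BitClient-1") stand out.
public class ThreadNameCount {

    /** Returns a name -> count map of all live threads in this JVM. */
    static Map<String, Integer> countByName() {
        Map<String, Integer> counts = new TreeMap<>();
        // getAllStackTraces() returns one entry per live thread.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(t.getName(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        int total = 0;
        for (Map.Entry<String, Integer> e : countByName().entrySet()) {
            total += e.getValue();
            System.out.println(e.getValue() + "\t" + e.getKey());
        }
        System.out.println("total threads: " + total);
    }
}
```

This only sees threads in the JVM that runs it, so it would have to be called from the test itself (for example from a teardown method) to be useful for a hang like the one described.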
