tomcat-dev mailing list archives

From Mark Thomas
Subject Re: Current sporadic test suite failures
Date Wed, 18 Dec 2013 13:27:45 GMT
On 18/12/2013 13:20, Rainer Jung wrote:
> On 18.12.2013 13:12, Rainer Jung wrote:
>> On 17.12.2013 11:01, Rainer Jung wrote:
>>> A) TC 7
>>> =======
>>> 3) org.apache.tomcat.websocket.TestWsSubprotocols
>>> -------------------------------------------------
>>> 1 x bio, 1 x nio
>>> Failure:
>>> Testcase: testWsSubprotocols took 4.005 sec
>>>         Caused an ERROR
>>> null
>>> java.lang.NullPointerException
>>>         at
>>> org.apache.tomcat.websocket.TestWsSubprotocols.testWsSubprotocols(
>> Added logging shows:
>> - Most often the sequence is SubProtocolsEndpoint.processOpen(), then
>> the subprotocol assertion check which comes after
>> wsContainer.connectToServer() and wsSession.isOpen() and finally another
>> SubProtocolsEndpoint.processOpen(). In this case the test succeeds.
>> - Rarely the assertion check comes after both
>> SubProtocolsEndpoint.processOpen() calls, then the test also succeeds
>> - Rarely the assertion check comes before the two calls to
>> SubProtocolsEndpoint.processOpen(), then the test fails with NPE,
>> because SubProtocolsEndpoint.subprotocols wasn't assigned yet (that
>> happens in SubProtocolsEndpoint.processOpen()).
>> Would it be OK to add a short sleep before the assertion or does that
>> changed order of execution in fact indicate a server side problem
>> instead of a test impl problem?
> OK, it is expected that sometimes the main (client) thread proceeds too
> fast and the server side hasn't yet done what the client expects, so the
> subprotocols side-effect that is used to check for success hasn't yet
> materialized. Adding a 100ms sleep seems to be long enough (probably in
> combination with the associated thread yield) to allow the server side
> to make progress.

Based on previous experience, a fixed sleep can be fragile. It often
continued to fail on heavily loaded systems (as the CI system often
is). Waiting up to ~5 seconds in a while loop that re-tests the
condition every 10/50/100ms (take your pick) was more robust and tended
to give more stable test results.
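
As a rough sketch of that wait-and-poll idea (not the actual change to
the test): the class name PollingWait, the method waitFor() and the
AtomicReference standing in for the subprotocols field are all
hypothetical names chosen for illustration, and the 5000ms / 50ms values
are just one pick from the ranges above.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of the Tomcat test suite.
final class PollingWait {

    private PollingWait() {
        // Utility class, no instances.
    }

    /**
     * Re-tests the condition every intervalMs milliseconds until it holds
     * or timeoutMs milliseconds have passed.
     *
     * @return true if the condition became true, false on timeout
     */
    static boolean waitFor(BooleanSupplier condition, long timeoutMs,
            long intervalMs) throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // Gave up: condition never became true.
            }
            Thread.sleep(intervalMs);
        }
        return true;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the field the server side assigns when the
        // connection is opened (an assumption for this demo only).
        AtomicReference<String[]> subprotocols = new AtomicReference<>();

        // Simulated server thread that sets the side effect late.
        Thread server = new Thread(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            subprotocols.set(new String[] { "sp2" });
        });
        server.start();

        // Client side: wait up to ~5 seconds, checking every 50ms,
        // before running any assertion on the field.
        boolean seen = waitFor(() -> subprotocols.get() != null, 5000, 50);
        System.out.println("side effect observed: " + seen);
        server.join();
    }
}
```

On a fast machine the loop exits after a few iterations, so it costs
little more than the fixed sleep, while a loaded CI box gets the full
timeout before the test gives up.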

> I also added a reset for the subprotocols field to the end of the two
> tests, so that the first test doesn't influence the result of the second
> (which was actually the case for most runs here).

Good idea.

> So this was also only a problem in the test impl.

Agreed. Thanks for looking into this.

