zookeeper-dev mailing list archives

From Chris Nauroth <cnaur...@hortonworks.com>
Subject Re: making CI a more pleasant experience
Date Mon, 04 May 2015 21:13:11 GMT
Another possibility for speeding up pre-commit runs might be to execute
JUnit tests in parallel.  In build.xml, we'd specify the threads attribute
on the <junit> task to start multiple JUnit processes.  This requires Ant
1.9.4 or later, and the tests must be forked with forkmode="perTest" for
the attribute to take effect.

https://ant.apache.org/manual/Tasks/junit.html


It's not as simple as just deploying Ant 1.9.4 and turning it on, though.
We'd need to make sure that test suites are capable of running
concurrently.  They'd need isolated working directories, port numbers,
etc.  Right now, the PortAssignment class hands out port numbers that are
guaranteed to be unique within the same process, but not across multiple
processes.  I just tried it locally, and many tests failed on bind
exceptions.  (This is actually a potential problem already if multiple
ZooKeeper pre-commit runs execute concurrently on the same Jenkins host.)

I'll investigate if we can change tests to bind to ephemeral ports, which
would give us uniqueness across multiple processes.
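
As a rough sketch of the ephemeral-port idea (not the current
PortAssignment code; the class and method names here are hypothetical),
binding to port 0 asks the OS to pick a free port, so two test JVMs on the
same host should not be handed the same one.  The socket has to stay open
until the server under test takes it over; closing it and re-binding the
number later would reintroduce a race.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical illustration only; not part of the ZooKeeper test code.
public class EphemeralPortSketch {

    /** Binds to port 0 so the OS chooses a currently unused port. */
    public static ServerSocket bindEphemeral() throws IOException {
        return new ServerSocket(0);
    }

    public static void main(String[] args) throws IOException {
        // Both sockets are held open at once, so the OS must assign
        // two different port numbers.
        try (ServerSocket a = bindEphemeral();
             ServerSocket b = bindEphemeral()) {
            System.out.println("first port:  " + a.getLocalPort());
            System.out.println("second port: " + b.getLocalPort());
        }
    }
}
```

The catch is that the port number is only known after the bind, so any test
that writes the port into a config file up front would need to be reworked.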

--Chris Nauroth




On 5/3/15, 10:10 PM, "Patrick Hunt" <phunt@apache.org> wrote:

>I've pushed back on use of sleep in non-deterministic ways in the past. I
>think we do a reasonable job there, just grepping for sleep doesn't tell
>the story.
>
>Where you run into issues is when you
>
>do x
>sleep(500)
>check x was success
>
>Most of our use of sleep has migrated to
>
>do x
>for (1 to 120) // or check elapsed time and cap at some large number
>  sleep(500) // make sufficiently small that you don't waste time waiting
>             // unnecessarily, but also not too short that you spin
>  check x was success
>
>unless we were able to make do without a time bound at all - sometimes we
>migrate to a latch or something.
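
For illustration, the two patterns quoted above might look roughly like this
in Java.  This is only a sketch: WaitSketch and waitFor are hypothetical
names, not existing ZooKeeper test utilities, and the interval and timeout
values are placeholders.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical illustration only; not part of the ZooKeeper test code.
public class WaitSketch {

    /**
     * Polls the condition every intervalMs until it holds or timeoutMs
     * has elapsed. Returns true if the condition was observed in time.
     */
    public static boolean waitFor(BooleanSupplier condition,
                                  long intervalMs,
                                  long timeoutMs) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // time bound reached, give up
            }
            Thread.sleep(intervalMs);
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch done = new CountDownLatch(1);
        new Thread(done::countDown).start(); // stands in for "do x"

        // Bounded polling: returns as soon as the check passes.
        boolean polled = waitFor(() -> done.getCount() == 0, 50, 5_000);

        // Latch: no polling interval, still capped by a timeout.
        boolean latched = done.await(5, TimeUnit.SECONDS);

        System.out.println("polled=" + polled + " latched=" + latched);
    }
}
```

Either way the test finishes as soon as the condition holds, and the cap only
matters when something has actually gone wrong.
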
>
>Now it's been a while since I reviewed the tests, new code might have
>added some bad checks again, it's a tough one to stamp out entirely.
>
>Re tests taking too long, I can't seem to find the jira, but iirc Henry
>had created a jira around reducing the tick time for tests - that
>significantly reduced the setup time for quorum based tests - a big part
>of overall overhead. We should probably categorize our tests and run a
>subset outside of a nightly "full test run".
>
>Patrick
>
>On Sun, May 3, 2015 at 9:49 PM, Raúl Gutiérrez Segalés <rgs@itevenworks.net> wrote:
>
>> Hi,
>>
>> On 3 May 2015 at 12:53, Chris Nauroth <cnauroth@hortonworks.com> wrote:
>>
>> > (....)
>> > 3. Tests are non-deterministic, such as by hard-coding a sleep time to
>> > wait for an asynchronous action to complete.  The solutions usually
>> > involve providing hooks into lower-layer logic, such as to receive a
>> > callback from the asynchronous action, so that the test can be
>> > deterministic.
>> >
>>
>> Indeed:
>>
>> ~/src/zookeeper-svn/src/java/test/org/apache/zookeeper (master) ✔ git grep -i 'sleep(' | wc -l
>> 91
>>
>> Making runs shorter would be very helpful as well. Currently it just
>> takes too long.
>>
>> Also, adding to what Patrick said, I'll take a closer look at the runs
>> reported at:
>>
>> https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/
>>
>> to have a better grasp of what's going on. Thanks!
>>
>>
>> -rgs
>>
