couchdb-dev mailing list archives

From Jan Lehnardt <m...@jan.io>
Subject Re: [RFC] On the Testing of CouchDB
Date Sat, 16 Dec 2017 21:29:27 GMT

On 16. Dec 2017, at 18:56, Paul Davis <paul.joseph.davis@gmail.com> wrote:

>> The one thing that would be nice here is if it were easy to disable certain
>> tests or suites that make no sense in the pouchdb-server environment, so
>> they can easily integrate it in their CI.
> 
> The cool thing is that Elixir supports this natively in that you can
> add tags to tests to selectively enable/disable test classes so this
> will just be a matter of letting the pouchdb team disable anything
> that doesn't make sense for their implementation.
> 
>> It would be great if we could use this opportunity to apply this across
>> all JS test files when we port them to Elixir. It means a little bit
>> more work per test file, but I hope with a few more contributors and
>> guidelines, this is an easily paralleliseable task, so individual burden
>> can be minimised.
> 
> My current approach so far is to try and first port the test directly
> and then afterwards go back through and refactor things to be more
> traditional. My thinking here was that the initial port could be
> reviewed alongside the existing JS test to double check that we're not
> dropping any important tests or assertions along the way before we
> start moving a lot of code around. That said, Elixir still allows us
> to break things up. So for instance the replication.js tests I've broken
> up into a number of functions that still follow the same order as the
> original suite but once the initial port is done it'll be trivial to
> split that out into a base class and then have each test extend from
> there. It's also possible to generate tests too, so the replication
> tests that check for all combinations of local/remote source/target
> pairs end up as separate tests.
> 
>> I noticed that one of the reduce tests took 30+ seconds to run on my
>> machine and I experimented with different cluster configuration values
>> and to nobody's surprise, the default of q=8 is the main factor in view
>> test execution speed. q=4 takes ~20s, q=2 ~10s and q=1 ~5s. I’m not
>> suggesting we set q=1 for all tests since q>1 is a behaviour we would
>> want to test as well, but maybe we can set q=2 when running the test
>> suite(s) for the time being. Shaving 25s off of a single test will get
>> us a long way with all tests ported. What do others think?
> 
> I've noticed some pretty terrible slowness on OS X (which I'm assuming
> you're running on) and chatting with Russell it appears that on Linux
> there's a massive speed difference when running tests. I'd very much
> prefer to keep our tests against a Q=3 cluster.

Thanks for the clarification, and to you too, Russell! Just a nit: our default q is 8, not 3, but q=3 is
still a lot faster than q=8 ;)

As for the performance difference: on Darwin, Erlang issues an F_FULLFSYNC, which does some magic
to coerce otherwise lying hard drives into flushing their caches, something that Linux fsync()
doesn't do. On spinning disks, this meant ~3 file:sync() calls/s on Darwin vs. a lot more on Linux.
Multiplied by n=3 x q=8, them's a lot of F_FULLFSYNCs to go around. I don't know about SSDs,
though, so this is somewhat speculative :)
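
For anyone who wants to check this on their own disk, here is a rough sketch (in Python rather than Erlang, and not taken from CouchDB) that times repeated syncs on a temporary file. It uses F_FULLFSYNC where Python's fcntl module exposes it (macOS) and falls back to a plain fsync() elsewhere. The helper names full_sync, syncs_per_second and shard_copies are made up for illustration, and the numbers it prints depend entirely on the machine.

```python
import fcntl
import os
import tempfile
import time


def full_sync(fd):
    """Flush fd to stable storage, using F_FULLFSYNC where the platform offers it."""
    if hasattr(fcntl, "F_FULLFSYNC"):
        # macOS: ask the drive itself to flush its write cache.
        fcntl.fcntl(fd, fcntl.F_FULLFSYNC)
    else:
        # Linux and others: plain fsync().
        os.fsync(fd)


def syncs_per_second(count=20, directory=None):
    """Write one byte and sync, `count` times; return the observed syncs per second."""
    with tempfile.NamedTemporaryFile(dir=directory) as f:
        fd = f.fileno()
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, b"x")
            full_sync(fd)
        elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


def shard_copies(n=3, q=8):
    """Number of shard files in a cluster with n copies of each of q shards."""
    return n * q


if __name__ == "__main__":
    rate = syncs_per_second()
    print(f"observed sync rate: {rate:.1f}/s")
    print(f"shard copies at n=3, q=8: {shard_copies()}")
    print(f"shard copies at n=3, q=2: {shard_copies(q=2)}")
```

Running it on both a Darwin and a Linux box with comparable disks should show whether the sync rate really differs by the margin described above. The shard count is only the n x q product mentioned in the text, not a model of how many syncs a given test triggers.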

> I'd like to try and
> dig in a bit to see if we can't figure out where we're having such a
> dramatic time difference between the two. Hopefully some quick
> measuring will point us to a knob to adjust to speed things up without
> sacrificing cluster nodes during the tests.

