Mailing-List: contact dev-help@couchdb.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@couchdb.apache.org
Received-SPF: pass (nike.apache.org: domain of wendallc@83864.com designates
 209.85.214.180 as permitted sender)
Sender: Wendall Cada <wendallc@83864.com>
Message-ID: <51631F9F.4020501@apache.org>
Date: Mon, 08 Apr 2013 12:50:55 -0700
From: Wendall Cada <wendallc@apache.org>
User-Agent: Mozilla/5.0 (X11; Linux i686;
 rv:17.0) Gecko/20130311 Thunderbird/17.0.4
MIME-Version: 1.0
To: dev@couchdb.apache.org
Subject: Re: Javascript Test Suite
References: <51535F0B.7060408@apache.org> <515F64E9.9050207@apache.org>
 <CAJ_m3YAE2Ae9r8abHmFYty1-EfingB0kF3yjrAPOLv8fH9COKw@mail.gmail.com>
In-Reply-To: 
 <CAJ_m3YAE2Ae9r8abHmFYty1-EfingB0kF3yjrAPOLv8fH9COKw@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Thanks for the feedback on this, Paul. There are a few other comments in 
this thread around the same issue. I believe we have a consensus that 
using couchjs isn't the way forward because of its limitations. I'm going 
to start a new thread with a proposal around the ideas here so we can 
discuss where this all fits into the roadmap.

I'll merge this with master when I get a chance. I will also need to 
pull the relevant parts into 1.3.x.

Wendall

On 04/08/2013 07:26 AM, Paul Davis wrote:
> The first bit I'd like to say is that the use of couchjs was just a
> stop gap measure to get the test suite out of the browser. We used to
> have to deal with so many browser issues it was just a terrible mess.
> The issue with couchjs is, much as you've seen, that it's not a very
> full environment for writing tests. So just to be clear, the only real
> thing tying us to it as a test platform is that we have a large
> amount of JS written already, so either we need to make couchjs
> better, use node, or translate the tests to something that has a more
> useful environment.
>
> I've been noodling over whether we might be better off to just start
> translating everything to Python or something. I've seen suggestions
> for Erlang but I personally think Erlang is a terrible language for
> writing tests like this (specifically, the code to test ratio is
> ungood). If we had something like Python to hack on then I was also
> thinking of writing a library function that would start CouchDB as a
> slave process which then would remove the need to have the _restart
> handler because you could just kill -9 the subprocess and restart it
> with maybe a wait for when things boot again.
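
[A minimal sketch of the library function Paul describes above: run the
server as a child process, kill it hard, and wait for it to boot again.
The class name ManagedServer, the argv command line, and the is_ready
probe are all hypothetical; nothing here comes from the CouchDB tree,
and a real probe would presumably issue an HTTP request to the server.]

```python
import subprocess
import time


class ManagedServer:
    """Run a server as a child process so a test can kill and restart it."""

    def __init__(self, argv, is_ready, boot_timeout=10.0):
        self.argv = argv              # command line (hypothetical for CouchDB)
        self.is_ready = is_ready      # callable: True once the server answers
        self.boot_timeout = boot_timeout
        self.proc = None

    def start(self):
        """Spawn the process and block until is_ready() reports True."""
        self.proc = subprocess.Popen(self.argv)
        deadline = time.monotonic() + self.boot_timeout
        while time.monotonic() < deadline:
            if self.proc.poll() is not None:
                raise RuntimeError("server exited during boot")
            if self.is_ready():
                return
            time.sleep(0.05)
        self.kill()
        raise TimeoutError("server did not become ready in time")

    def kill(self):
        """Hard-kill the child (SIGKILL on POSIX) and reap it."""
        if self.proc is None:
            return
        if self.proc.poll() is None:
            self.proc.kill()
        self.proc.wait()

    def restart(self):
        """Replace the _restart handler: kill, then boot a fresh process."""
        self.kill()
        self.start()
```

[Because restart() always spawns a new process and waits on the probe,
a test cannot mistake the old instance for the restarted one.]
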
>
> I reviewed your feature branch the other day and I'm +1 for pushing
> that to master.
>
> Awesome work, Wendall.
>
> On Fri, Apr 5, 2013 at 6:57 PM, Wendall Cada <wendallc@apache.org> wrote:
>> I wanted to follow up on this.
>>
>> I've created a feature branch for this and a JIRA issue
>> https://issues.apache.org/jira/browse/COUCHDB-1762
>>
>> Overall, I think the worst problem is that the tests really aren't
>> debuggable in any sane way, and logging is essentially useless for most
>> things. The only sure way to spot an error most of the time is if it's an
>> actual CouchDB bug and shows up in the log. I'm not sure how this can ever
>> be fixed with the current test suite. I'd opt for testing with jasmine, but
>> that would require not using couchjs for the test runner, so for now, I just
>> focused on getting random failures under control.
>>
>> Paul was kind enough to share some code that he wrote recently to deal with
>> the rampant _restart issues.
>> https://github.com/davisp/couchdb/commit/0cbf6a9cea01eea599524bcdb77dedb322c7ade4
>> This is a very sound approach in using a token so you can see if it actually
>> restarts. The current test suite can result in false positives very easily,
>> which leads to test failures. I think this is probably the biggest reason
>> for the random failures. In a previous IRC conversation with Bob (rnewson),
>> Jan, and I think Benoit (sorry if not the case), _restart was deemed something
>> that should go away. I filed a ticket for its removal,
>> https://issues.apache.org/jira/browse/COUCHDB-1714, and as Bob points out in
>> the comments, it is useful for the test suite. I'd argue it's only useful
>> with Paul's patch adding a token. Otherwise, it's just not reliable at all.
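
[A rough sketch of the token idea, not Paul's actual patch: record a
token before asking for a restart, then poll until the server reports a
different one. The helper wait_for_restart and the read_token callable
are hypothetical; how the token is exposed by the server is an
assumption left to the caller.]

```python
import time


def wait_for_restart(read_token, old_token, timeout=10.0, interval=0.1):
    """Return the new token once the server reports one other than old_token.

    read_token is a caller-supplied callable (hypothetical) that fetches
    the current token; it may raise OSError while the server is down.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            token = read_token()
        except OSError:
            token = None              # server not reachable yet, keep polling
        if token is not None and token != old_token:
            return token              # a different token proves a real restart
        time.sleep(interval)
    raise TimeoutError("server still reports the old token, or is down")
```

[A server that answers with the old token is treated as "not restarted
yet", which is exactly the false positive a plain "is it up?" check
cannot rule out.]
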
>>
>> For the branch I created, instead of using _restart, I did some bash magic
>> with a pipe and stop/start the process through the local run script. This
>> has the same drawback of not knowing if CouchDB restarted, or we just got a
>> false positive. To account for this, I put a small delay in the execution of
>> the lookup, using a new method isRunning to give a little time to stop.
>>
>> I also changed the suite to run a new couchjs for each test file. I'm not
>> certain at this point that this is even necessary at all, but I still think
>> it's safer in case of a crash, since the rest of the suite can continue.
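
[The one-interpreter-per-test-file idea, sketched in Python rather than
the bash used in the branch. The function run_each and its parameters
are hypothetical; interpreter stands for whatever command line launches
couchjs with its support files, which is not spelled out here.]

```python
import subprocess


def run_each(interpreter, test_files, timeout=60):
    """Run every test file in its own interpreter process.

    interpreter is a list such as the couchjs command plus its arguments
    (an assumption); each test file path is appended to it in turn.
    """
    results = {}
    for path in test_files:
        try:
            done = subprocess.run(interpreter + [path], timeout=timeout)
            results[path] = "pass" if done.returncode == 0 else "fail"
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child on timeout, so a hung test
            # cannot block the files that follow it.
            results[path] = "timeout"
    return results
```

[A crash or hang is confined to one entry in the results, and the rest
of the suite still runs.]
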
>>
>> Other changes I made were just timing related in running the test suite for
>> spinning disks, and a couple bug fixes in individual tests.
>>
>> The lack of timers makes writing these tests very ugly. I really dislike
>> this, but so long as the test suite needs couchjs, I don't see a way to
>> avoid this without implementing our own setInterval method in C.
>>
>> One last item. I was getting a consistent failure on CentOS 6. I tracked
>> this down to a bug in libcurl. For some reason, after any xhr request that
>> returns a 416, the very next send() will hang for a long time, then
>> eventually crash couchjs. The specific version causing the issue is
>> curl-7.19.7-35.el6 and libcurl-7.19.7-35.el6. I'm not certain if this is
>> worth reporting in JIRA, but it will certainly cause a test suite failure
>> consistently in attachment_ranges, but otherwise appears to be fairly
>> harmless. Maybe this should be documented somewhere?
>>
>> Wendall
>>
>>
>> On 03/27/2013 02:05 PM, Wendall Cada wrote:
>>> In 1.3.0, there is a new part of the test suite to run the JavaScript
>>> tests from the command line. I'm running into various issues on different
>>> hardware/OS configurations. Mostly, tests hanging or timing out and failing.
>>> These are really hard to troubleshoot, as they all pass just fine if run
>>> individually.
>>>
>>> What I'm experimenting with today is rewriting how the tests are
>>> implemented to be run one at a time from a loop in bash, versus a loop in
>>> JavaScript. I think the failures I'm running into are caused by improper
>>> setup/teardown. There may be an issue with rapid delete and adding a db, or
>>> rapidly starting and stopping CouchDB, but I think this is not what's
>>> happening in my failures.
>>>
>>> The nature of SpiderMonkey doesn't allow for spawning threads, or
>>> sandboxing, etc., so it's hard, looking at the test suite, to see how I
>>> can improve running all tests. I think it's far better to have the setup
>>> spawn a new interpreter for each test. Teardown will kill the interpreter.
>>>
>>> Wendall
>>