Message-ID: <515F64E9.9050207@apache.org>
Date: Fri, 05 Apr 2013 16:57:29 -0700
From: Wendall Cada
To: dev@couchdb.apache.org
Subject: Re: Javascript Test Suite
In-Reply-To: <51535F0B.7060408@apache.org>

I wanted to follow up on this. I've created a feature branch for this and a JIRA issue:

https://issues.apache.org/jira/browse/COUCHDB-1762

Overall, I think the worst problem is that the tests really aren't debuggable in any sane way, and logging is essentially useless for most things. The only sure way to spot an error most of the time is if it's an actual CouchDB bug and shows up in the log. I'm not sure how this can ever be fixed with the current test suite.
I'd opt for testing with jasmine, but that would require not using couchjs for the test runner, so for now I just focused on getting random failures under control.

Paul was kind enough to share some code that he wrote recently to deal with the rampant _restart issues:

https://github.com/davisp/couchdb/commit/0cbf6a9cea01eea599524bcdb77dedb322c7ade4

This is a very sound approach: it uses a token so you can see whether CouchDB actually restarted. The current test suite can very easily produce false positives, which leads to test failures. I think this is probably the biggest reason for the random failures.

In a previous IRC conversation with Bob (rnewson), Jan, and I think Benoit (sorry if not the case), _restart was deemed something that should go away. I filed a ticket for its removal, https://issues.apache.org/jira/browse/COUCHDB-1714, and as Bob points out in the comments, _restart is useful for the test suite. I'd argue it's only useful with Paul's patch adding a token. Otherwise, it's just not reliable at all.

For the branch I created, instead of using _restart, I did some bash magic with a pipe to stop/start the process through the local run script. This has the same drawback of not knowing whether CouchDB restarted or we just got a false positive. To account for this, I put a small delay in the execution of the lookup, using a new method, isRunning, to give the process a little time to stop.

I also changed the suite to run a new couchjs for each test file. I'm not certain at this point that this is even necessary, but I still think it's safer in case of a crash, since the rest of the suite can continue.

The other changes I made were timing related, for running the test suite on spinning disks, plus a couple of bug fixes in individual tests. The lack of timers makes writing these tests very ugly. I really dislike this, but so long as the test suite needs couchjs, I don't see a way to avoid it without implementing our own setInterval method in C.

One last item.
I was getting a consistent failure on CentOS 6. I tracked this down to a bug in libcurl. For some reason, after any XHR request that returns a 416, the very next send() will hang for a long time, then eventually crash couchjs. The specific versions causing the issue are curl-7.19.7-35.el6 and libcurl-7.19.7-35.el6. I'm not certain whether this is worth reporting in JIRA. It consistently causes a test suite failure in attachment_ranges, but otherwise appears to be fairly harmless. Maybe this should be documented somewhere?

Wendall

On 03/27/2013 02:05 PM, Wendall Cada wrote:
> In 1.3.0, there is a new part of the test suite to run the javascript
> tests from the command line. I'm running into various issues on
> different hardware/OS configurations. Mostly, tests hanging or timing
> out and failing. These are really hard to troubleshoot, as they all
> pass just fine if run individually.
>
> What I'm experimenting with today is rewriting how the tests are
> implemented to be run one at a time from a loop in bash, versus a loop
> in javascript. I think the failures I'm running into are improper
> setup/teardown. There may be an issue with rapid delete and adding a
> db, or rapidly starting and stopping couchdb, but I think this is not
> what's happening in my failures.
>
> The nature of spidermonkey doesn't allow for spawning threads, or
> sandboxing, etc, so it's hard looking at the test suite to see how I
> can improve running all tests. I think it's far better to have the
> setup spawn a new interpreter for each test. Tear down will kill the
> interpreter.
>
> Wendall
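
For reference, the token idea discussed in this thread (only treat the server as restarted once it reports a token different from the one seen before the restart) might be sketched roughly as below. This is not Paul's patch and not the code on the feature branch. The names waitForRestart, readToken and pause are hypothetical, and how the token is actually exposed by the server is left to the caller as an assumption. Since couchjs has no timers, the blocking wait is passed in rather than assumed.

```javascript
// Hypothetical helper; none of these names are CouchDB or couchjs APIs.
//
// oldToken  - token read from the server before asking it to restart
// readToken - caller-supplied function returning the server's current
//             token, or null (or throwing) if the server is unreachable
// pause     - caller-supplied blocking wait, since couchjs has no timers
// maxTries  - how many times to poll before giving up
// delayMs   - value handed to pause() between polls
function waitForRestart(oldToken, readToken, pause, maxTries, delayMs) {
  for (var i = 0; i < maxTries; i++) {
    var token = null;
    try {
      token = readToken();
    } catch (e) {
      // Server is down or refusing connections: not restarted yet.
    }
    if (token !== null && token !== oldToken) {
      // A different token means a new server instance answered,
      // so this is not the old process replying before it shut down.
      return token;
    }
    pause(delayMs);
  }
  throw new Error("server did not restart after " + maxTries + " tries");
}
```

The point of comparing tokens rather than just checking that the server answers is that a reply from the old, not-yet-stopped process is never mistaken for a successful restart.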