From: Knut Anders Hatlen
Subject: Re: SpawnedProcess arguments and behavior
Date: Thu, 09 Feb 2012 13:01:33 +0100
In-Reply-To: <4F33A1E4.7040201@oracle.com> (Kristian Waagan's message of "Thu, 09 Feb 2012 11:37:24 +0100")

Kristian Waagan writes:

> Hi,
>
> I've been looking a bit at SpawnedProcess, and I'm planning to do some
> changes to it. The most important change is to make
> BaseTestCase.readProcessOutput use the class, since reading the output
> from the subprocess requires extra code that should be isolated to one
> location. There is reason to believe a problem with readProcessOutput
> is the cause of the interrupt-related errors reported recently by
> Myrna and, possibly, Kathey.
>
> What's troubling me are the arguments destroy and timeout, especially
> the combination of the two.
> For me, a timeout implies destroy == true. Specifying a timeout and
> setting destroy to false is effectively the same as setting destroy to
> true, since destroy will be forced to true when a timeout occurs.

Agreed. I think the use case is to be able to forcefully quit a process
immediately (it's used this way only in NetworkServerTestSetup, I
think). We probably need to preserve that functionality, but it's
probably less confusing if we have one method for immediate destruction
(with no parameters) and one with a timeout (and no destroy parameter).

> For automated test runs it would be best if complete() always returns,
> although many test frameworks have mechanisms to kill the main process
> if it takes too long.
> For debugging it may be best to keep the
> subprocess running and the main process hanging to allow for
> inspection. I think it should be possible to obtain the stack (java
> stack or native stack) of the subprocess, then kill it manually to get
> stdout/stderr and have the main process continue.
>
> I'd prefer to settle on one of two approaches, since that would
> simplify the code and define a consistent behavior:
> a) Never destroy the process.
> b) Always destroy the process if hanging for more than a default
> amount of time.
>
> Opinions?

Option a is of course the easier one to implement. Is it possible to
get the stack of the sub-process in a portable way with option b?

If I understand correctly, the suggestion is to always have a timeout
when calling complete(), right? That sounds reasonable to me, provided
that the timeout is high enough to avoid errors when the termination of
the sub-process just happens to be slow.

However, I think most of the times we've seen hangs involving
sub-processes, they've been caused by some kind of deadlock in the
communication between the main test process and the sub-process
(typically both processes waiting for output from the other one). In
those cases, the test never gets as far as calling complete(), and a
timeout in complete() wouldn't help.

To address those cases, SpawnedProcess might need a timeout mechanism
that automatically destroys the process if it has lived too long. But
then the default timeout must be very high, since it must account for
the time it takes to run the test case, not just the time it takes to
shut down the process after completion of the test, and we don't want
the timeout to cause problems on slow machines.

--
Knut Anders
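
To make the shape discussed above concrete, here is a rough sketch of
what it could look like: one parameterless method for immediate
destruction, one complete() that takes only a timeout (a timeout always
implies destroy), and a watchdog that destroys the sub-process if it
lives too long. This is not the actual SpawnedProcess code. The class
name SpawnedProcessSketch, the method destroyNow(), the parameter names
and the complete(long) signature are all assumptions made for
illustration, and the sketch assumes Java 8 or later for
Process.waitFor(long, TimeUnit), isAlive() and destroyForcibly().

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical sketch only; this is NOT Derby's SpawnedProcess class.
 * Assumes Java 8+ (Process.waitFor(long, TimeUnit), isAlive(),
 * destroyForcibly()).
 */
public class SpawnedProcessSketch {
    private final Process process;
    // Daemon timer, so the watchdog never keeps the test JVM alive.
    private final Timer watchdog =
            new Timer("spawned-process-watchdog", true);

    /**
     * @param process the already started sub-process
     * @param maxLifetimeMillis assumed upper bound on the lifetime of the
     *        sub-process; must cover the whole test case, so it should
     *        be generous on slow machines
     */
    public SpawnedProcessSketch(final Process process,
                                long maxLifetimeMillis) {
        this.process = process;
        // Covers hangs where the test never gets as far as complete().
        watchdog.schedule(new TimerTask() {
            @Override
            public void run() {
                if (process.isAlive()) {
                    process.destroyForcibly();
                }
            }
        }, maxLifetimeMillis);
    }

    /** Immediate destruction: no timeout, no destroy flag. */
    public int destroyNow() throws InterruptedException {
        return kill();
    }

    /**
     * Waits up to timeoutMillis for a normal exit. There is no destroy
     * parameter: if the timeout expires, the process is destroyed.
     */
    public int complete(long timeoutMillis) throws InterruptedException {
        if (process.waitFor(timeoutMillis, TimeUnit.MILLISECONDS)) {
            watchdog.cancel();
            return process.exitValue();
        }
        return kill();
    }

    private int kill() throws InterruptedException {
        watchdog.cancel();
        process.destroyForcibly();
        // Wait until the process is really gone before reporting.
        return process.waitFor();
    }
}
```

A caller would wrap the Process returned by Runtime.exec() or
ProcessBuilder.start() and then call either destroyNow() or
complete(timeout). The watchdog does not fix the deadlock where both
processes wait for output from each other; it only bounds how long such
a hang can last.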