From: Dawid Weiss
Date: Wed, 2 May 2012 11:05:14 +0200
Subject: Re: hudson hung
To: dev@lucene.apache.org

> Hmm, as a test, I tried adding @Timeout(millis=100000) to
> LuceneTestCase, ie 100 seconds, which I think should not trigger on
> any core tests today.
>
> That should then apply to all subclasses of LuceneTestCase right?  Ie
> all tests will be aborted after 100 seconds...

Yes, it should abort each test after 100 seconds (or not if it
completes earlier).

> But, something is wrong: I get lots of quick (ie much less than 100
> seconds) failures like this:
>
>   [junit4] Suite: org.apache.lucene.util.TestRamUsageEstimatorOnWildAnimals
>   [junit4] ERROR   0.00s J2 | TestRamUsageEstimatorOnWildAnimals (suite)
>   [junit4]    > Throwable #1: java.lang.RuntimeException: Interrupted while waiting for worker?

Weird. I'm not sure what's wrong and I'm not able to tell you right
now (vacation...). I will look into this once I come back.

The problem of "stopping" a running test is not trivial because there
are no easy ways of telling which threads have been started from a
test, which threads should outlive the test (as in a thread pool
initialized from @BeforeClass but used from within a test -- the pool
threads should outlive it)...
finally, there are cases when a thread simply cannot be killed by an
external thread (see the test cases in randomizedtesting -- look for
zombie tests if you're curious). There are many sneaky scenarios.

The timeout on Jenkins is also not trivial, I think. Uwe reported that
stopping a build on Jenkins leaves forked JVMs running in the
background. I looked into the Jenkins source code to see how they kill
those running processes... it's really hairy -- OS-dependent and doing
scans of /proc... brr... Apparently that is not an easy thing to do
either.

I'll upgrade junit4 and randomizedtesting to version 1.4.0, which
should fix a few problems related to event passing, but I'll have to
postpone timeouts until I'm back at work full time.

Dawid
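
P.S. To illustrate the "thread that cannot be killed from outside" case mentioned above, here is a minimal sketch. It is not how randomizedtesting implements timeouts; the class and method names (ZombieThreadSketch, interruptAndWait, cooperative, zombie) are made up for this example. It only shows that Thread.interrupt() is a request, which a thread is free to swallow.

```java
public class ZombieThreadSketch {

    /**
     * Hypothetical helper: asks the worker to stop via interrupt() and waits
     * up to graceMillis for it to die. Returns true only if it really stopped.
     */
    static boolean interruptAndWait(Thread worker, long graceMillis) {
        worker.interrupt();
        try {
            worker.join(graceMillis);
        } catch (InterruptedException e) {
            // Preserve the caller's interrupt status and fall through.
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    /** A well-behaved thread: it exits as soon as it is interrupted. */
    static Thread cooperative() {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // Interrupted: leave run() and let the thread terminate.
            }
        }, "cooperative");
        t.setDaemon(true);
        t.start();
        return t;
    }

    /** A "zombie": it swallows every interrupt and keeps looping. */
    static Thread zombie() {
        Thread t = new Thread(() -> {
            while (true) {
                try {
                    Thread.sleep(1_000);
                } catch (InterruptedException e) {
                    // Ignore the request to stop and carry on.
                }
            }
        }, "zombie");
        t.setDaemon(true); // daemon, so the JVM can still exit in this demo
        t.start();
        return t;
    }

    public static void main(String[] args) {
        System.out.println("cooperative stopped: " + interruptAndWait(cooperative(), 2_000));
        System.out.println("zombie stopped: " + interruptAndWait(zombie(), 200));
    }
}
```

The zombie is marked as a daemon only so that the demo can terminate; a non-daemon thread behaving this way would also keep the forked JVM alive.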