From: Phil Steitz <phil.steitz@gmail.com>
Date: Tue, 17 May 2011 12:26:39 -0700
To: Commons Developers List <dev@commons.apache.org>
Subject: Re: [math] [GUMP@vmgump]: Project commons-math (in module apache-commons) failed

On 5/17/11 1:22 AM, luc.maisonobe@free.fr wrote:
> ----- "Phil Steitz" wrote:
>
>> On 5/16/11 3:47 PM, Gilles Sadowski wrote:
>>> On Mon, May 16, 2011 at 02:39:01PM -0700, Phil Steitz wrote:
>>>> On 5/16/11 3:44 AM, Dr. Dietmar Wolz wrote:
>>>>> Nikolaus Hansen, Luc, and I discussed this issue in Toulouse.
>>> Reading that, I've been assuming that...
>>>
>>>>> We have two options to handle this kind of failure in tests of
>>>>> stochastic optimization algorithms:
>>>>> 1) fixed random seed - but this reduces the value of the test
>>>>> 2) using the RetryRunner - preferred solution
>>>>>
>>>>> @Retry(3) should be sufficient for all tests.
>>>>>
>>>> The problem with that is that it is really equivalent to just
>>>> reducing the sensitivity of the test from alpha to alpha^3 (if,
>>>> e.g., the test will pick up anomalies with stochastic probability
>>>> of less than alpha as is, making it retry three times really just
>>>> reduces that sensitivity to alpha^3).

This (my statement above) is not quite correct, or at least whether it
is correct or not depends on the problem. While the failure
probabilities may be the same for three retries versus one run with
lower sensitivity, the results mean different things, and the first is
generally more likely to indicate a change-related problem. Sorry for
this mistake.
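To make the arithmetic concrete anyway, here is a back-of-the-envelope
sketch (plain Java, not from our test code; alpha and p are made-up
illustrative values, and I am assuming "pass if any attempt passes"
retry semantics):

    /** Back-of-the-envelope arithmetic for the retry discussion. */
    public class RetryArithmetic {

        public static void main(String[] args) {
            // Probability that a single run fails spuriously (the
            // significance level of the stochastic assertion).
            double alpha = 0.01;

            // With @Retry(3), a spurious failure requires three
            // false positives in a row.
            double spuriousWithRetry = Math.pow(alpha, 3); // 1.0E-6

            // The flip side: if a real regression makes a run fail
            // with probability p < 1, the retried test reports it
            // only when all three runs fail, so sensitivity drops
            // from p to p^3.
            double p = 0.5;
            double detectionWithRetry = Math.pow(p, 3); // 0.125

            System.out.println("spurious, single run : " + alpha);
            System.out.println("spurious, retry x 3  : " + spuriousWithRetry);
            System.out.println("detection, single run: " + p);
            System.out.println("detection, retry x 3 : " + detectionWithRetry);
        }
    }

So retrying trades false positives against sensitivity to marginal,
intermittent regressions; whether that trade is right depends on the
problem, which is the point above.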
>>>> I think the right answer here is to find out why the test is
>>>> failing with higher than, say, .001 probability and fix the
>>>> underlying problem. If the test itself is too sensitive, then we
>>>> should fix that. Then switch to a fixed seed for the released
>>>> code, reverting to random seeding when the code is under
>>>> development.
>>> ... they had settled on the best approach for the class at hand.
>> Whatever rationale was discussed should be summarized here, on the
>> public list.
> We did not look at the code itself when we met, but rather spoke
> about stochastic tests at large. Nikolaus said using an optimization
> algorithm as a black box is clearly not a good thing, Dietmar said
> stochastic tests are useful and may fail sometimes, and I said unit
> tests in a continuous integration process are needed and should not
> fail randomly. All these statements are true, I think; they only
> differ because they look at the problem from different points of
> view. It was basically the same thing we already said on the list
> some months ago about the statistics tests, when we finally chose to
> set up a retry procedure (was it for the Chi square or for the
> Pascal distribution?).

These have pretty much all been removed. I think the RandomDataImpl
tests are the only ones that still use retries. I was planning to
remove those as well.

> There is unfortunately no perfect answer. We talked about both the
> fixed seed approach and the retry procedure, and Dietmar did not
> like the fixed seed, so we chose the other one.
>
> From old memories, I think Ted proposed something different about
> generating random numbers that was used in Mahout. Ted, could you
> explain to us again what you proposed?

I won't speak for Ted, but IIRC, he was the first to advocate fixed
seeds. After thinking more about the problem, I think the best
approach is random seeds during development, changed to fixed seeds
prior to release. In the random data generation tests, we can state
and control precisely the probability that a test will fail randomly.
When working on the code, it's best to set this fairly low and use
random seeds. Generally, when you screw something up, failures will
happen consistently. Even with a fixed seed and p(false positive) =
.0001, the tests fail pretty reliably when something is broken. So
the approach above works well for these. I guess the random seed +
retry approach will also work here in general; but this is going to
depend on the problem, and it makes the sensitivities harder to set
and understand.
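To be concrete, a test in that style might look roughly like the
following (a minimal sketch in the spirit of the RandomDataImpl
tests, using the commons-math TestUtils chi-square test; the class
name, seed, bin count, and alpha are made up for illustration, not
what is actually in our test code):

    import org.apache.commons.math.random.RandomDataImpl;
    import org.apache.commons.math.stat.inference.TestUtils;
    import org.junit.Assert;
    import org.junit.Test;

    public class NextIntUniformityTest {

        @Test
        public void testNextIntUniform() throws Exception {
            RandomDataImpl randomData = new RandomDataImpl();

            // Fixed seed for released code; while developing, drop
            // this (or reSeed with System.currentTimeMillis()) to
            // get a fresh random seed on each run.
            randomData.reSeed(1000L);

            // Generate n values in [0, bins - 1] (bounds inclusive)
            // and count how many land in each bin.
            final int bins = 4;
            final int n = 1000;
            long[] observed = new long[bins];
            for (int i = 0; i < n; i++) {
                observed[randomData.nextInt(0, bins - 1)]++;
            }
            double[] expected = new double[bins];
            for (int i = 0; i < bins; i++) {
                expected[i] = (double) n / bins;
            }

            // alpha is exactly the probability of a spurious
            // failure, so the false positive rate is stated
            // explicitly rather than implicit in a retry count.
            final double alpha = 0.0001;
            Assert.assertFalse("nextInt distribution is not uniform",
                    TestUtils.chiSquareTest(expected, observed, alpha));
        }
    }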
And while the retry approach drives down the probability of spurious
failure, it does not eliminate it. Do we have any way to bound or
estimate the expected failure probability of the optimization tests?
Can we relate these estimates to expected errors in returned values?
What bothers me about just setting a retry number is that a) there
may be an underlying problem that is being masked, and b) if there is
any way that we can estimate the likelihood of bad results, we should
document that.

Phil

>>> [I.e. we had raised the possibility that there could be a bug in
>>> the code that triggered test failures, but IIUC they have now
>>> concluded that the code is fine and that failures are expected to
>>> happen sometimes.]
>> I would like to understand better why that is the case. If
>> failures happen sometimes in test, does that mean that bad results
>> are expected to be returned sometimes? If so, have we documented
>> that?
>>
>>> It still seems strange that it is always the same 2 tests that
>>> fail. Is there an explanation for this behaviour, that we might
>>> add as a comment in the test code?
>> I agree here, and possibly in the javadoc for the application
>> code. If the code is prone to generating spurious results
>> sometimes, we need to make that clear in the javadoc.
> It really depends on the function you optimize, with or without
> local minima. Perhaps this test case is for a known difficult
> problem; I didn't look at this.
>
> Luc
>
>> Phil
>>> Gilles