Mailing-List: contact dev-help@commons.apache.org; run by ezmlm
Message-ID: <4E1FE49F.6020801@free.fr>
Date: Fri, 15 Jul 2011 08:56:31 +0200
From: Luc Maisonobe
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.15)
 Gecko/20110419 Thunderbird/3.1.9
To: Commons Developers List
Subject: Re: [math] Re: Longley Data
References: <4E1BE13B.2030405@gmail.com> <4E1C68BE.7070109@gmail.com> <4E1C783B.4000006@gmail.com> <4E1CA1F8.30201@gmail.com>

On 15/07/2011 02:37, Greg Sterijevski wrote:
> The usual issues with numerical techniques, how you calculate
> (c * x + d * y)/e matters...
> It turns out that religiously following the article and defining
> c_bar = c / e is not a good idea.
>
> The Filippelli data is still a bit dicey. I would like to resolve where the
> error is accumulating there as well. That's really the last thing preventing
> me from sending the patch with the Miller-Gentleman Regression to Phil.

I don't know whether this is feasible in your case, but when hunting for
this kind of numerical error, I have found it useful to redo the
computation in parallel at high precision. Until a few months ago, I
simply did this in emacs (yes, emacs rocks) configured for 50 significant
digits. Now it is easier, since we have our own dfp package in [math].

Luc

>
> -Greg
>
> On Thu, Jul 14, 2011 at 1:18 PM, Ted Dunning wrote:
>
>> What was the problem?
>>
>> On Wed, Jul 13, 2011 at 8:33 PM, Greg Sterijevski wrote:
>>
>>> Phil,
>>>
>>> Got it! I fit Longley to all printed values. I have not broken anything...
>>> I need to type up a few loose ends, then I will send a patch.
>>>
>>> -Greg
>>>
>>> On Tue, Jul 12, 2011 at 2:35 PM, Phil Steitz wrote:
>>>
>>>> On 7/12/11 12:12 PM, Greg Sterijevski wrote:
>>>>> All,
>>>>>
>>>>> So I included the Wampler data in the test suite.
>>>>> The interesting thing is, to get clean runs I need wider tolerances
>>>>> with OLSMultipleRegression than with the version of the Miller
>>>>> algorithm I am coding up.
>>>> This is good for your Miller impl, not so good for
>>>> OLSMultipleRegression.
>>>>> Perhaps we should come to a consensus of what good enough is? How
>>>>> close do we want to be? Should we require passing on all of NIST's
>>>>> 'hard' problems? (for all regression techniques that get cooked up)
>>>>>
>>>> The goal should be to match all of the displayed digits in the
>>>> reference data. When we can't do that, we should try to understand
>>>> why and aim to, if possible, improve the impls. As we improve the
>>>> code, the tolerances in the tests can be improved. Characterization
>>>> of the types of models where the different implementations do well /
>>>> poorly is another thing we should aim for (and include in the
>>>> javadoc). As with all reference validation tests, we need to keep
>>>> in mind that a) the "hard" examples are designed to be numerically
>>>> unstable and b) conversely, a handful of examples does not really
>>>> demonstrate correctness.
>>>>
>>>> Phil
>>>>> -Greg
>>>>>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org
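
As a rough illustration of the high-precision cross-check suggested in the reply above, the sketch below evaluates (c * x + d * y)/e two ways in double precision, once dividing at the end and once with the pre-divided coefficients c_bar = c / e and d_bar = d / e, and compares both against a 50-digit reference. It uses Python's standard decimal module as a stand-in for emacs or the [math] dfp package. The function names and the sample values are made up for the illustration and do not come from the regression code under discussion.

```python
from decimal import Decimal, getcontext

# 50 significant digits, mirroring the emacs setup mentioned in the thread.
getcontext().prec = 50


def direct(c, x, d, y, e):
    """Evaluate (c*x + d*y)/e in double precision, dividing once at the end."""
    return (c * x + d * y) / e


def prescaled(c, x, d, y, e):
    """Same quantity with c_bar = c/e and d_bar = d/e formed first."""
    c_bar = c / e
    d_bar = d / e
    return c_bar * x + d_bar * y


def reference(c, x, d, y, e):
    """High-precision reference; Decimal(float) converts each double exactly."""
    c, x, d, y, e = (Decimal(v) for v in (c, x, d, y, e))
    return (c * x + d * y) / e


def abs_error(value, ref):
    """Absolute error of a double result against the Decimal reference."""
    return abs(Decimal(value) - ref)


# Hypothetical sample values, chosen only to exercise the two formulas.
args = (0.1, 3.0, 0.7, 11.0, 0.3)
ref = reference(*args)
print("direct    error:", abs_error(direct(*args), ref))
print("prescaled error:", abs_error(prescaled(*args), ref))
```

Applying the same comparison to the intermediate quantities of a regression update, one step at a time, is one way to see where the double-precision result first drifts away from the reference.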