commons-dev mailing list archives

From Luc Maisonobe <Luc.Maison...@free.fr>
Subject Re: [math] Re: Longley Data
Date Fri, 15 Jul 2011 06:56:31 GMT
On 15/07/2011 02:37, Greg Sterijevski wrote:
> The usual issues with numerical techniques: how you calculate
> (c * x + d * y) / e matters...
> It turns out that religiously following the article and defining
> c_bar = c / e is not a good idea.
>
> The Filippelli data is still a bit dicey. I would like to resolve where
> the error is accumulating there as well. That's really the last thing
> preventing me from sending the patch with the Miller-Gentleman regression
> to Phil.
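To make the ordering issue concrete, here is a minimal self-contained sketch (with hypothetical numbers, not the Longley or Filippelli data) of how pre-dividing the coefficients, as in c_bar = c / e, can lose accuracy when c * x and d * y nearly cancel:

```java
// Hypothetical demo: when c*x and d*y nearly cancel, dividing by e last
// keeps the cancellation exact, while pre-dividing (c_bar = c / e) bakes
// the rounding error of c/e into terms of magnitude ~1e16, which then
// swamps the small true result.
public class DivideLastDemo {
    public static void main(String[] args) {
        double c = 1.0, d = -1.0, e = 3.0;
        double x = 1.0e16, y = 1.0e16 - 2.0;   // nearly cancelling pair

        // Form A: divide at the end -- the subtraction x - y is exact here.
        double divideLast = (c * x + d * y) / e;

        // Form B: pre-divide the coefficients, as in c_bar = c / e.
        double cBar = c / e, dBar = d / e;
        double preDivided = cBar * x + dBar * y;

        System.out.println("divide last : " + divideLast);
        System.out.println("pre-divided : " + preDivided);
    }
}
```

With these inputs the exact answer is 2/3; the pre-divided form is off by a large margin because the rounding of c/e (roughly 1e-17 relative) is multiplied by 1e16 before the cancellation takes place.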

I don't know whether this is feasible in your case, but when trying to
track down this kind of numerical error, I have found it useful to simply
redo the computation in parallel at high precision. Up until a few months
ago, I did this using emacs (yes, emacs rocks) configured for 50
significant digits. Now it is easier, since we have our own dfp package
in [math].
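
For what it's worth, the same "shadow computation" idea can be sketched with the JDK's own BigDecimal (the numbers below are hypothetical; in [math] the dfp package plays the same role):

```java
import java.math.BigDecimal;
import java.math.MathContext;

// Sketch of the parallel high-precision check: evaluate a sensitive
// expression such as (c*x + d*y)/e both in double arithmetic and at
// 50 significant digits, then compare the two to see where digits are lost.
public class HighPrecisionShadow {
    static final MathContext MC50 = new MathContext(50); // 50 significant digits

    static double inDouble(double c, double x, double d, double y, double e) {
        return (c * x + d * y) / e;
    }

    static BigDecimal shadow(double c, double x, double d, double y, double e) {
        // new BigDecimal(double) converts each input exactly, so the shadow
        // computation sees the same inputs the double computation saw.
        return new BigDecimal(c).multiply(new BigDecimal(x), MC50)
                .add(new BigDecimal(d).multiply(new BigDecimal(y), MC50), MC50)
                .divide(new BigDecimal(e), MC50);
    }

    public static void main(String[] args) {
        // Hypothetical inputs where c*x rounds and the products nearly cancel.
        double c = 100000001.0, x = 100000001.0;
        double d = -1.0, y = 1.00000002e16, e = 3.0;
        System.out.println("double : " + inDouble(c, x, d, y, e));
        System.out.println("50-dig : " + shadow(c, x, d, y, e));
    }
}
```

Here the 50-digit shadow recovers the true value (1/3 for these inputs), while the double computation loses all significant digits because c * x rounds before the cancellation.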

Luc

>
> -Greg
>
> On Thu, Jul 14, 2011 at 1:18 PM, Ted Dunning <ted.dunning@gmail.com> wrote:
>
>> What was the problem?
>>
>> On Wed, Jul 13, 2011 at 8:33 PM, Greg Sterijevski <gsterijevski@gmail.com>
>> wrote:
>>
>>> Phil,
>>>
>>> Got it! I fit Longley to all printed values. I have not broken
>>> anything... I need to tie up a few loose ends, then I will send a patch.
>>>
>>> -Greg
>>>
>>> On Tue, Jul 12, 2011 at 2:35 PM, Phil Steitz <phil.steitz@gmail.com>
>>> wrote:
>>>
>>>> On 7/12/11 12:12 PM, Greg Sterijevski wrote:
>>>>> All,
>>>>>
>>>>> So I included the Wampler data in the test suite. The interesting
>>>>> thing is that to get clean runs I need wider tolerances with
>>>>> OLSMultipleRegression than with the version of the Miller algorithm
>>>>> I am coding up.
>>>> This is good for your Miller impl, not so good for
>>>> OLSMultipleRegression.
>>>>> Perhaps we should come to a consensus on what "good enough" is. How
>>>>> close do we want to be? Should we require passing all of NIST's
>>>>> 'hard' problems (for all regression techniques that get cooked up)?
>>>>>
>>>> The goal should be to match all of the displayed digits in the
>>>> reference data.  When we can't do that, we should try to understand
>>>> why and aim to, if possible, improve the impls.   As we improve the
>>>> code, the tolerances in the tests can be improved.  Characterization
>>>> of the types of models where the different implementations do well /
>>>> poorly is another thing we should aim for (and include in the
>>>> javadoc).  As with all reference validation tests, we need to keep
>>>> in mind that a) the "hard" examples are designed to be numerically
>>>> unstable and b) conversely, a handful of examples does not really
>>>> demonstrate correctness.
>>>>
>>>> Phil
>>>>> -Greg
>>>>>
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>>> For additional commands, e-mail: dev-help@commons.apache.org
>>>>
>>>>
>>>
>>
>

