commons-dev mailing list archives

From Phil Steitz <phil.ste...@gmail.com>
Subject Re: [math] Validation of special functions
Date Sat, 20 Oct 2012 17:37:12 GMT
On 10/19/12 7:43 AM, Sébastien Brisard wrote:
> Hi Phil,
> thanks for your answer.
>
> 2012/10/19 Phil Steitz <phil.steitz@gmail.com>:
>> On 10/18/12 11:13 PM, Sébastien Brisard wrote:
>>> Hi,
>>> I have recently started to update the users guide on special
>>> functions, providing accuracy measurements of our implementations.
>>> To get these figures, I carried out extensive comparisons with
>>> reference values computed with Maxima to 128 decimal digits.
>>> I also wrote a Java command-line app to automate the process.
>>> Briefly speaking, the reference values computed with Maxima are saved
>>> as a binary file
>>>   - for a unary function f(x), the data is stored as follows
>>>     x[0], f(x[0]), x[1], f(x[1]), ...
>>>   - for a binary function f(x, y), the data is stored as follows
>>>     x[0], y[0], f(x[0], y[0]), x[1], y[1], f(x[1], y[1]), ...
>>>   - and a similar storage pattern for an n-ary function.
>>>
>>> The signature of the function to be tested can be arbitrary, provided
>>> all its arguments are of primitive type: the app will manage to read
>>> the reference values.
>>> The app then computes, for each tuple (x[i], y[i], ...), the
>>> Commons-Math value of f(x[i], y[i], ...) and the error in ulps. This
>>> error is passed to a SummaryStatistics, which is printed on the
>>> standard output when the end of the input file is reached.
>>> The app also writes a binary output file, where the data is stored as follows
>>> x[0], y[0], reference value of f(x[0], y[0]), CM value of f(x[0],
>>> y[0]), error in ulps, ...
>>>
>>> This binary file can then be plotted if necessary in order to locate
>>> the areas where the accuracy is at its worst.
>>>
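
A minimal sketch of the comparison loop described above, for a unary
function. It assumes the reference file holds big-endian IEEE 754
doubles, as written by java.io.DataOutputStream (the message does not
state the byte order), and it accumulates count, mean and maximum by
hand where the app uses SummaryStatistics. The class and method names
are hypothetical and are not taken from the app.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.function.DoubleUnaryOperator;

public class UlpErrorSketch {

    /** Absolute error of actual with respect to expected, in ulps of expected. */
    public static double errorInUlps(double expected, double actual) {
        return Math.abs(actual - expected) / Math.ulp(expected);
    }

    /**
     * Reads records x[i], f(x[i]) until the end of the stream and returns
     * {count, mean error, max error}, with errors expressed in ulps.
     * Assumption: values are big-endian doubles (DataOutputStream layout).
     */
    public static double[] summarize(InputStream in, DoubleUnaryOperator f)
            throws IOException {
        DataInputStream data = new DataInputStream(in);
        long n = 0;
        double sum = 0.0;
        double max = 0.0;
        while (true) {
            double x;
            try {
                x = data.readDouble();
            } catch (EOFException e) {
                break; // clean end of input: no more records
            }
            double expected = data.readDouble();
            double err = errorInUlps(expected, f.applyAsDouble(x));
            sum += err;
            max = Math.max(max, err);
            n++;
        }
        return new double[] {n, n == 0 ? 0.0 : sum / n, max};
    }
}
```

For an n-ary function the same loop would read n arguments before the
reference value; only the record length changes.
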
>>> The app takes a properties file as input, here is an example
>>>
>>> method=org.apache.commons.math3.special.Gamma.logGamma
>>> signature=double
>>> inputFileMask=logGamma-%02d.dat
>>> outputFileMask=logGamma-out-%02d.dat
>>> from=1
>>> to=5
>>> by=1
>>>
>>> The "method" key is the fully qualified name of the function to be
>>> validated. Requirements on this function are
>>>   - static
>>>   - returns double
>>>   - takes only primitive arguments
>>>
>>> The "signature" key is necessary to distinguish between functions with
>>> the same name. In case there are multiple arguments, the value of this
>>> key should read e.g. "double, double"
>>>
>>> "inputFileMask" and "outputFileMask" are the file names of the input
>>> and output binary files. In order to be able to handle multiple files
>>> in a row, indexed file names can be used; the format for the indexed
>>> file names must then follow the syntax of String.format().
>>> "from" is the value of the first index (inclusive), "to" is the value
>>> of the last index (exclusive), and "by" is the increment.
>>>
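
A sketch of how the keys described above could be interpreted:
"method" and "signature" resolve one overload by reflection, and
"from", "to" and "by" expand a file mask through String.format(). This
is only an illustration under those stated rules, not the app's actual
source; the class name and its helper methods are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class ValidationConfigSketch {

    /** Parses properties text, e.g. the content of the properties file. */
    public static Properties parse(String text) throws IOException {
        Properties p = new Properties();
        p.load(new StringReader(text));
        return p;
    }

    /** Maps a primitive type name such as "double" to its Class object. */
    public static Class<?> primitive(String name) {
        switch (name) {
            case "double": return double.class;
            case "float": return float.class;
            case "long": return long.class;
            case "int": return int.class;
            case "short": return short.class;
            case "byte": return byte.class;
            case "char": return char.class;
            case "boolean": return boolean.class;
            default:
                throw new IllegalArgumentException("not a primitive type: " + name);
        }
    }

    /** Resolves the overload named by "method" and "signature"; checks static and double. */
    public static Method resolve(Properties p) throws ReflectiveOperationException {
        String fqn = p.getProperty("method");
        int dot = fqn.lastIndexOf('.');
        String[] names = p.getProperty("signature").split(",");
        Class<?>[] types = new Class<?>[names.length];
        for (int i = 0; i < names.length; i++) {
            types[i] = primitive(names[i].trim());
        }
        Method m = Class.forName(fqn.substring(0, dot))
                        .getMethod(fqn.substring(dot + 1), types);
        if (!Modifier.isStatic(m.getModifiers()) || m.getReturnType() != double.class) {
            throw new IllegalArgumentException(fqn + " must be static and return double");
        }
        return m;
    }

    /** Expands the mask stored under maskKey for indices from (inclusive) to (exclusive). */
    public static List<String> fileNames(Properties p, String maskKey) {
        int from = Integer.parseInt(p.getProperty("from"));
        int to = Integer.parseInt(p.getProperty("to"));
        int by = Integer.parseInt(p.getProperty("by"));
        if (by <= 0) {
            throw new IllegalArgumentException("by must be positive");
        }
        List<String> names = new ArrayList<>();
        for (int i = from; i < to; i += by) {
            names.add(String.format(p.getProperty(maskKey), i));
        }
        return names;
    }
}
```

With the example file above, the input mask would expand to
logGamma-01.dat through logGamma-04.dat, since "to" is exclusive.
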
>>> This app is very simple, but it could prove useful to anyone
>>> implementing a new special function in CM. Therefore, I was wondering
>>> what would be the best way to include it in our library. Also, I would
>>> like people to be able to check all the figures I state on the
>>> website. Therefore, I would like to provide all the reference data
>>> I've used so far (I have more in store, not yet used to update the
>>> users guide). As previously discussed, I gave up binary files for unit
>>> tests, which are just "safety guards" to check whether or not the
>>> implementation is totally wrong. However, for this extensive analysis
>>> of the accuracy, I thought it was better to stick with binary data
>>> files.
>>>
>>> What do you think? Do you think that this app should be provided to
>>> all? Same question for reference data files [1]?
>>>
>>> Thanks for your comments,
>>> Sébastien
>>>
>>> [1] For reference data files, maybe providing the Maxima scripts (and
>>> the properties files) would suffice.
>> I would go ahead and commit all of this stuff in test/maxima,
>> similarly to what we have now for R.
>>
> Fine with me. However, I will not maintain the same coding standards
> in the source file of the app as in the source files of CM proper. In
> other words: there is no Javadoc, and I'd rather not write one for
> lack of time. Actually, the source is (I believe) fairly readable as
> is.
> Do you think that would be all right? (Provided I include a readme.txt
> indicating how the whole stuff works, of course).
>
>> It would be great to also have
>> similar tests using R, as R is freely available OSS and having two
>> different comparison impls would help avoid "we both made the same
>> mistake" issues or false negatives.  Make sure to include a text
>> file in the top level test/maxima directory that describes how
>> everything works, so others can add patches.  Thanks for doing this.
>>
> Maxima is also freely available, and I highly recommend it for
> symbolic calcs! (My whole PhD has been done with this software, while
> colleagues of mine are still using expensive Maple...)

Sweet!  You keep teaching me about great OSS products :) Thanks!
> I'm OK doing some tests with R, but how do you intend to interpret the
> results? I'm not aware of any multiprecision packages in R (please
> correct me if I'm wrong), in which case, any special function
> implemented in R is bound to have some finite accuracy. How do you
> assess the accuracy of the Java impl?
> On the other hand, Maxima can work with arbitrary precision, and I'm
> therefore able to find the "exact" double representation of the value
> f(x) I'm trying to assess.
> I'm not sure I'm perfectly clear...
>
> I think (unless arbitrary precision is possible) that R could be used
> for testing purposes, not for assessment of the actual accuracy.
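
To illustrate the distinction drawn above, here is a small sketch of
how a multiprecision reference, supplied as a decimal string, can be
used on the Java side: it is rounded to the nearest double with
Double.parseDouble, and the error of a computed value is measured
against the unrounded reference, which keeps sub-ulp information that
a double-precision reference cannot provide. The class name is
hypothetical, the digits used in any run would come from the
multiprecision tool, and "actual" is assumed to be finite.

```java
import java.math.BigDecimal;
import java.math.MathContext;

public class ExactReferenceSketch {

    /** Nearest double to a decimal string (Double.parseDouble rounds to nearest). */
    public static double nearestDouble(String digits) {
        return Double.parseDouble(digits);
    }

    /**
     * Signed error of actual with respect to the multiprecision reference,
     * in ulps of the reference rounded to double. Assumes actual is finite.
     */
    public static double errorInUlps(String referenceDigits, double actual) {
        BigDecimal reference = new BigDecimal(referenceDigits);
        // new BigDecimal(double) is exact, so no information is lost here.
        BigDecimal diff = new BigDecimal(actual).subtract(reference);
        BigDecimal ulp = new BigDecimal(Math.ulp(nearestDouble(referenceDigits)));
        return diff.divide(ulp, MathContext.DECIMAL64).doubleValue();
    }
}
```

A computed value equal to the nearest double then shows an error of at
most 0.5 ulp, while its neighbours show more.
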

I don't think arbitrary precision is built into R, but there is an
add-on package, mpfr, that claims to support it.  I have not tried
it.  By specifying the digits to display in R, you can effectively test
up to what can be represented in a double, which has picked up some
differences between our impls and R in a few places.  Having a third
reference for the special functions, even to just limited precision,
would still be good if others are motivated to contribute.  In any
case, what you have done with Maxima is a big step forward.

Phil
>
> I agree with you we should double check with another software, but we
> need a multi-precision package with special functions implemented. I
> don't know many (non-commercial) of them.
>
> What do you think?
> Sébastien
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>