commons-dev mailing list archives

From Sébastien Brisard <sebastien.bris...@m4x.org>
Subject Re: [math] Validation of special functions
Date Fri, 19 Oct 2012 14:43:11 GMT
Hi Phil,
thanks for your answer.

2012/10/19 Phil Steitz <phil.steitz@gmail.com>:
> On 10/18/12 11:13 PM, Sébastien Brisard wrote:
>> Hi,
>> I have recently started to update the users guide on special
>> functions, providing accuracy measurements of our implementations.
>> To get these figures, I carried out extensive comparisons with
>> reference values computed with Maxima, using 128 decimal digits.
>> I also wrote a Java command-line app to automate the process.
>> Briefly speaking, the reference values computed with Maxima are saved
>> as a binary file
>>   - for a unary function f(x), the data is stored as follows
>>     x[0], f(x[0]), x[1], f(x[1]), ...
>>   - for a binary function f(x, y), the data is stored as follows
>>     x[0], y[0], f(x[0], y[0]), x[1], y[1], f(x[1], y[1]), ...
>>   - and a similar storage pattern for an n-ary function.
>>
>> The signature of the function to be tested can be arbitrary, provided
>> all its arguments are of primitive type: the app will manage to read
>> the reference values.
>> The app then computes for each tuple (x[i], y[i], ...) the
>> Commons-Math value of f(x[i], y[i], ...) and the error in ulps. This
>> error is transmitted to a SummaryStatistics, which is printed on the
>> standard output when the end of the input file is reached.
>> The app also writes a binary output file, where the data is stored as follows
>> x[0], y[0], reference value of f(x[0], y[0]), CM value of f(x[0],
>> y[0]), error in ulps, ...
>>
>> This binary file can then be plotted if necessary in order to locate
>> the areas where the accuracy is at its worst.
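
The unary case described above can be sketched as follows. This is not the source of the app: the class and method names are hypothetical, the byte layout (big-endian IEEE 754 doubles, 8 bytes each) is an assumption since the message does not specify it, the sign convention of the error is also an assumption, and a plain count/min/max/mean stands in for SummaryStatistics so that the sketch needs nothing beyond the JDK.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.function.DoubleUnaryOperator;

/** Hypothetical helper, not the actual validation app. */
final class UnaryValidation {

    /**
     * Reads records x[0], f(x[0]), x[1], f(x[1]), ... and returns
     * {count, min, max, mean} of the signed error in ulps.
     * Assumption: values are big-endian IEEE 754 doubles, 8 bytes each.
     */
    static double[] validate(InputStream raw, DoubleUnaryOperator f) {
        long count = 0;
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        double sum = 0.0;
        try (DataInputStream in = new DataInputStream(raw)) {
            while (true) {
                double x;
                try {
                    x = in.readDouble();
                } catch (EOFException endOfData) {
                    break; // clean end of input: no more records
                }
                // A record cut off after x raises EOFException here,
                // which is reported as an error by the outer catch.
                double expected = in.readDouble();
                double actual = f.applyAsDouble(x);
                // Signed error, measured in ulps of the reference value.
                double err = (actual - expected) / Math.ulp(expected);
                min = Math.min(min, err);
                max = Math.max(max, err);
                sum += err;
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // With no records at all, the mean is NaN.
        return new double[] {count, min, max, sum / count};
    }
}
```

A call such as UnaryValidation.validate(new FileInputStream("logGamma-01.dat"), Gamma::logGamma) would be the intended use, assuming the file really follows the layout assumed here.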
>>
>> The app takes a properties file as input; here is an example:
>>
>> method=org.apache.commons.math3.special.Gamma.logGamma
>> signature=double
>> inputFileMask=logGamma-%02d.dat
>> outputFileMask=logGamma-out-%02d.dat
>> from=1
>> to=5
>> by=1
>>
>> The "method" key is the fully qualified name of the function to be
>> validated. Requirements on this function are
>>   - static
>>   - returns double
>>   - takes only primitive arguments
>>
>> The "signature" key is necessary to distinguish between functions with
>> the same name. In case there are multiple arguments, the value of this
>> key should read e.g. "double, double"
>>
>> "inputFileMask" and "outputFileMask" are the file names of the input
>> and output binary files. In order to be able to handle multiple files
>> in a row, indexed file names can be used; the format for the indexed
>> file names must then follow the syntax of String.format().
>> "from" is the value of the first index (inclusive), "to" is the value
>> of the last index (exclusive), "by" is the increment.
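
How such a properties file could be interpreted is sketched below. Again, this is an illustration rather than the app itself: ValidationConfig and its methods are hypothetical names, the lookup only finds public methods, and a positive "by" is assumed. The keys and the inclusive "from" / exclusive "to" convention are the ones described above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Properties;

/** Hypothetical helper, not the actual validation app. */
final class ValidationConfig {

    /** Parses properties text such as the example quoted above. */
    static Properties parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p;
    }

    /** Maps one entry of the "signature" key to a primitive class. */
    static Class<?> primitive(String name) {
        switch (name.trim()) {
            case "double": return double.class;
            case "float": return float.class;
            case "long": return long.class;
            case "int": return int.class;
            case "short": return short.class;
            case "byte": return byte.class;
            case "char": return char.class;
            case "boolean": return boolean.class;
            default:
                throw new IllegalArgumentException("not a primitive type: " + name);
        }
    }

    /** Resolves "method" + "signature" to a public static method returning double. */
    static Method resolve(Properties p) {
        String qualified = p.getProperty("method").trim();
        int dot = qualified.lastIndexOf('.');
        String[] names = p.getProperty("signature").split(",");
        Class<?>[] types = new Class<?>[names.length];
        for (int i = 0; i < names.length; i++) {
            types[i] = primitive(names[i]);
        }
        try {
            Method m = Class.forName(qualified.substring(0, dot))
                    .getMethod(qualified.substring(dot + 1), types);
            if (!Modifier.isStatic(m.getModifiers()) || m.getReturnType() != double.class) {
                throw new IllegalArgumentException(qualified + " must be static and return double");
            }
            return m;
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("cannot resolve " + qualified, e);
        }
    }

    /** Invokes the resolved static method on boxed primitive arguments. */
    static double call(Method m, Object... args) {
        try {
            return (Double) m.invoke(null, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Expands a file mask for from (inclusive) to to (exclusive), step by. */
    static List<String> fileNames(Properties p, String maskKey) {
        String mask = p.getProperty(maskKey).trim();
        int from = Integer.parseInt(p.getProperty("from").trim());
        int to = Integer.parseInt(p.getProperty("to").trim());
        int by = Integer.parseInt(p.getProperty("by").trim());
        if (by <= 0) {
            throw new IllegalArgumentException("by must be positive (assumption of this sketch)");
        }
        List<String> names = new ArrayList<>();
        for (int i = from; i < to; i += by) {
            names.add(String.format(Locale.ROOT, mask, i));
        }
        return names;
    }
}
```

With the example file above, fileNames(p, "inputFileMask") yields logGamma-01.dat through logGamma-04.dat, since "to" is exclusive.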
>>
>> This app is very simple, but it could prove useful to anyone
>> implementing a new special function in CM. Therefore, I was wondering
>> what would be the best way to include it in our library. Also, I would
>> like people to be able to check all the figures I state on the
>> website. Therefore, I would like to provide all the reference data
>> I've used so far (I have more in store, not yet used to update the
>> users guide). As previously discussed, I gave up binary files for unit
>> tests, which are just "safety guards" to check whether or not the
>> implementation is totally wrong. However, for this extensive analysis
>> of the accuracy, I thought it was better to stick with binary data
>> files.
>>
>> What do you think? Do you think that this app should be provided to
>> all? Same question for reference data files [1]?
>>
>> Thanks for your comments,
>> Sébastien
>>
>> [1] For reference data files, maybe providing the Maxima scripts (and
>> the properties files) would suffice.
>
> I would go ahead and commit all of this stuff in test/maxima,
> similarly to what we have now for R.
>
Fine with me. However, I will not maintain the same coding standards
in the source file of the app as in the source files of CM proper. In
other words: there is no Javadoc, and I'd rather not write one for
lack of time. Actually, the source is (I believe) fairly readable as
is.
Do you think that would be all right? (Provided I include a readme.txt
indicating how the whole stuff works, of course).

>
> It would be great to also have
> similar tests using R, as R is freely available OSS and having two
> different comparison impls would help avoid "we both made the same
> mistake" issues or false negatives.  Make sure to include a text
> file in the top level test/maxima directory that describes how
> everything works, so others can add patches.  Thanks for doing this.
>
Maxima is also freely available, and I highly recommend it for
symbolic calcs! (My whole PhD has been done with this software, while
colleagues of mine are still using expensive Maple...)
I'm OK doing some tests with R, but how do you intend to interpret the
results? I'm not aware of any multiprecision packages in R (please
correct me if I'm wrong), in which case, any special function
implemented in R is bound to have some finite accuracy. How do you
assess the accuracy of the Java impl?
On the other hand, Maxima can work with arbitrary precision, and I'm
therefore able to find the "exact" double representation of the value
f(x) I'm trying to assess.
I'm not sure I'm perfectly clear...
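
To make the distinction concrete, here is a minimal sketch of the idea. The names are hypothetical, and it assumes the high-precision reference is available as a plain decimal string (Maxima bigfloat output would first need converting to that form). Double.parseDouble rounds the decimal value to the nearest double, and BigDecimal keeps the full reference when measuring the error of a computed value.

```java
import java.math.BigDecimal;
import java.math.MathContext;

/** Hypothetical helper, not part of Commons Math or of the app. */
final class UlpError {

    /** Nearest double to a decimal reference with arbitrarily many digits. */
    static double nearestDouble(String referenceDigits) {
        // Double.parseDouble applies IEEE 754 round-to-nearest.
        return Double.parseDouble(referenceDigits);
    }

    /**
     * Signed error of a computed value against the full-precision reference,
     * in ulps of the correctly rounded reference.
     * Assumption: computed is finite (new BigDecimal(double) rejects NaN
     * and infinities).
     */
    static double errorInUlps(String referenceDigits, double computed) {
        BigDecimal exact = new BigDecimal(referenceDigits);
        // Math.ulp returns a power of two, which BigDecimal represents exactly.
        BigDecimal ulp = new BigDecimal(Math.ulp(nearestDouble(referenceDigits)));
        return new BigDecimal(computed)
                .subtract(exact)
                .divide(ulp, MathContext.DECIMAL64)
                .doubleValue();
    }
}
```

An error of magnitude below 0.5 ulp means the computed value is the correctly rounded one. A reference that is itself only accurate to double precision cannot establish that, since its own rounding error is of the same order as the error being measured.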

I think (unless arbitrary precision is possible) that R could be used
for testing purposes, not for assessment of the actual accuracy.

I agree with you that we should double-check with another piece of
software, but we need a multi-precision package with special functions
implemented. I don't know of many non-commercial ones.

What do you think?
Sébastien


