commons-dev mailing list archives

From: Phil Steitz
Subject: Re: [Math] MATH-894
Date: Thu, 15 Nov 2012 15:03:30 GMT
On 11/15/12 5:40 AM, Gilles Sadowski wrote:
> Hi.
>> These changes look fine to me and the addition of the compute method is
>> really nice.
> Oh, I should have asked whether you can also look at the source code in the
> trunk: Currently the "compute(UnivariateStatistic)" method is only
> implemented internally (in a private inner class of "DescriptiveStatistics").
> If you want this feature to be in the public API, could you create a new issue
> on JIRA? [Then we can see in which package this class should go.]

How to do this with good separation of concerns is an interesting
problem.  I agree with Gilles that making compute public in RDA is
not nice.  The basic problem is that we can't just pass around array
pointers as we would in C, so we need a way to apply a method that
takes an array and an offset as arguments without exposing either
the internal array or the offset through the RDA API.  A possibly
better approach might be to just create an interface - possibly
housed in .util - for methods that take double arrays and offsets
and compute something from them.  For example

public interface ArrayFunction {
    double evaluate(double[] values, int begin, int length);
}

Then in RDA, we add compute(ArrayFunction) as a public method.  Then
if we make UnivariateStatistic extend this new interface,
DescriptiveStatistics can get what it needs from this.
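
To make the idea concrete, here is a rough, self-contained sketch.
SimpleResizableArray and SegmentMean are hypothetical stand-ins (not
the actual RDA or any existing statistic class), and ArrayFunction is
only the interface proposed above, so treat the names and details as
assumptions rather than a proposed patch.

```java
/** Interface proposed in this message: a function of an array segment. */
interface ArrayFunction {
    double evaluate(double[] values, int begin, int length);
}

/** Simplified, hypothetical stand-in for RDA; not the real class. */
final class SimpleResizableArray {
    private double[] internal = new double[4];
    private int startIndex = 0;   // offset of the first live element
    private int numElements = 0;  // number of live elements

    void addElement(double value) {
        if (startIndex + numElements == internal.length) {
            // Out of room at the tail: grow and compact to offset 0.
            double[] grown = new double[internal.length * 2];
            System.arraycopy(internal, startIndex, grown, 0, numElements);
            internal = grown;
            startIndex = 0;
        }
        internal[startIndex + numElements++] = value;
    }

    /** Drops the oldest element by advancing the offset. */
    void discardFrontElement() {
        if (numElements == 0) {
            throw new IllegalStateException("array is empty");
        }
        startIndex++;
        numElements--;
    }

    /**
     * Applies f to the live segment. There is no getter for the array
     * or the offset; f only sees them for the duration of the call.
     */
    double compute(ArrayFunction f) {
        return f.evaluate(internal, startIndex, numElements);
    }
}

/** Hypothetical statistic that can be passed directly to compute. */
final class SegmentMean implements ArrayFunction {
    @Override
    public double evaluate(double[] values, int begin, int length) {
        if (length == 0) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (int i = begin; i < begin + length; i++) {
            sum += values[i];
        }
        return sum / length;
    }
}
```

With something like this, a caller would write
array.compute(new SegmentMean()) and no subclassing is needed.  The
function still receives the internal array while it runs, so this
hides the storage from the API but does not protect it from a
misbehaving function.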

Just a thought.  Would love to get better ideas on this.  What is
in trunk now works, but having to subclass for internal use makes me
wonder if we have really solved the problem.


>> Looking more closely at my code, I am using the getElements
>> method. As long as that remains available, it makes sense to deprecate the
>> getInternalValues method.
>> My use of ResizeableDoubleArray is related to an earlier discussion -
>> missing values. My data are stored in a database and it may contain missing
>> values. I know how many cases are in the database, but I don't know the
>> amount of missing data. I read the non-missing database values into a
>> ResizeableDoubleArray, call getElements() and use the nonmissing data array
>> in my calculations. It may be a bit clunky, but it's one of the ways I
>> handle missing data without looping over the database twice. I don't have a
>> solution for comprehensive treatment of missing data yet, but I appreciate
>> the conversation we are having.
> I'm afraid I don't follow you; I don't see the connection between missing
> values and the resizeable array. Maybe a small code example would help me.
> Regards,
> Gilles
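
For reference, my reading of the pattern described in the quoted text
is roughly the following.  This is only a guess at the usage:
GrowableDoubles is a minimal stand-in written for this sketch (with
Commons Math the resizable array's addElement and getElements would
play that role), and the rows are modelled as an Iterable<Double> in
which null marks a missing value, which is an assumption about how
the database results are represented.

```java
import java.util.Arrays;

/** Minimal stand-in for the resizable array, so the sketch has no dependencies. */
final class GrowableDoubles {
    private double[] data = new double[8];
    private int size = 0;

    void addElement(double value) {
        if (size == data.length) {
            data = Arrays.copyOf(data, size * 2);
        }
        data[size++] = value;
    }

    /** Returns a copy holding only the elements added so far. */
    double[] getElements() {
        return Arrays.copyOf(data, size);
    }
}

final class MissingDataSketch {
    /**
     * Single pass over the rows: missing values (null) are skipped, so
     * the number of non-missing values need not be known in advance.
     */
    static double[] nonMissing(Iterable<Double> rows) {
        GrowableDoubles buffer = new GrowableDoubles();
        for (Double row : rows) {
            if (row != null) {
                buffer.addElement(row);
            }
        }
        return buffer.getElements();
    }
}
```

The returned array has exactly as many entries as there were
non-missing rows, which is what avoids looping over the database a
second time just to count them.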