commons-dev mailing list archives

From "Mark R. Diggory" <mdigg...@latte.harvard.edu>
Subject Re: [math][PATCH] was Re: [math] exceptions or NaN from Univariate
Date Wed, 14 May 2003 15:31:23 GMT


O'brien, Tim wrote:
> I can see why someone might want to use the Univariate implementation as
> implemented currently, it is fast and efficient and requires no
> storage.  If I'm trying to get Univariate stats for a group of 1000
> longs in J2ME I might be interested in a storage-less implementation of
> this. 
> 

You should still be able to get this as the default behavior, even with 
the changes I've proposed.

> I do see that if window == Integer.MAX_VALUE no storage is used, but I'm
> wondering if we might want to put this into another implementation -
> this implementation should also provide Mode.
> 

Possibly even higher-order moments such as skewness and kurtosis.


This is a tough call: is the difference in implementation big enough 
that it requires its own class, or is the window simply a feature of a 
rolling Univariate stat? It is a conceptual argument. I say that if 
everything other than that one decision on storage is the same in the 
two hypothetical implementations, it's probably not a great enough 
difference to warrant two different implementations. However, if it is 
a feature that affects the performance of a significant number of 
properties in the class, maybe it should be separate. So far I only see 
it affecting one method computationally: "insertValue".
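
As a rough illustration of that point, here is a minimal sketch of a 
single class in which the window only changes what insertValue does. 
The class name, the constant name and the shift-based rolling are all 
assumptions made for the example; this is not the actual patch or the 
[math] API.

```java
// Hypothetical sketch: one class, with the window as a feature.
// Names are illustrative only and are not taken from the [math] code.
final class WindowedUnivariate {
    // Assumed sentinel meaning "no window, keep no storage".
    static final int INFINITE_WINDOW = Integer.MAX_VALUE;

    private final int window;
    private final double[] stored; // null when running storage-less
    private int n = 0;             // number of values currently counted
    private double sum = 0.0;      // running sum of the counted values

    WindowedUnivariate() {
        this(INFINITE_WINDOW);
    }

    WindowedUnivariate(int window) {
        if (window <= 0) {
            throw new IllegalArgumentException("window must be positive");
        }
        this.window = window;
        this.stored = (window == INFINITE_WINDOW) ? null : new double[window];
    }

    // The only method whose work depends on the window.
    void insertValue(double v) {
        if (stored != null) {
            if (n == window) {
                // Window is full: drop the oldest value and shift the rest down.
                sum -= stored[0];
                System.arraycopy(stored, 1, stored, 0, window - 1);
                n--;
            }
            stored[n] = v;
        }
        n++;
        sum += v;
    }

    int getN() {
        return n;
    }

    // Identical code path with or without a window.
    double getMean() {
        return n == 0 ? Double.NaN : sum / n;
    }

    public static void main(String[] args) {
        WindowedUnivariate rolling = new WindowedUnivariate(3);
        for (double v : new double[] {1, 2, 3, 4}) {
            rolling.insertValue(v);
        }
        System.out.println(rolling.getMean()); // mean of the last three values
    }
}
```

With the default constructor nothing is stored, so the storage-less 
behavior stays the default; only a finite window allocates an array.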



> I'd like to get a sense from [math] of whether we should modify
> Univariate in place or make Univariate an interface and provide multiple
> implementations. 
> 

I'm not sure there would be enough other implementations to warrant 
this.

> Also, using Integer.MAX_VALUE makes practical sense, but it might be
> better to choose a more "meaningless" default value that signifies
> infinity.  Double has the concept of POSITIVE_INFINITY, but integers do
> not.  "-1" is a common signal that a process has no positive upper
> limit.  I know this is a little bit of hair splitting, but I'd like to
> see what people think about this one.  I cannot forsee anyone needing to
> collect Univariate statistics on more than 2^31 - 1 elements, but I
> don't want to get in the business of introducing an arbitrary constant
> that causes some catastrophic failure.

There's a limitation here on the size of the array itself. What's the 
largest int[] you can have in Java? This is a cap imposed by "int" and 
by array capabilities: a window of "Double.POSITIVE_INFINITY - 1" is 
impossible from an array-size standpoint, and even a window of 
Integer.MAX_VALUE + 1 is impossible, whereas an array of 
"Integer.MAX_VALUE - 1" elements is theoretically possible. 
Integer.MAX_VALUE is the cap (although difficult to reach with today's 
memory constraints).
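
A small sketch of why Integer.MAX_VALUE + 1 cannot be a window size: 
Java array lengths are ints, and the addition wraps around to a 
negative number. The class and method names are made up for the 
example.

```java
// Hypothetical demo class; only standard java.lang behavior is used.
final class ArraySizeCap {
    // Returns true if allocating an array of "Integer.MAX_VALUE + 1"
    // elements is rejected because the length overflowed to a negative int.
    static boolean rejectsOverflowedLength() {
        int tooBig = Integer.MAX_VALUE + 1; // int overflow: wraps to Integer.MIN_VALUE
        try {
            int[] impossible = new int[tooBig];
            return impossible.length < 0; // not reached
        } catch (NegativeArraySizeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(Integer.MAX_VALUE + 1 == Integer.MIN_VALUE);
        System.out.println(rejectsOverflowedLength());
    }
}
```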


On a side note:

I also think I can save "computational effort" during the array rolling 
by tracking an index to start from and wrapping the for loop around the 
ends of the array with a modulus.
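
The idea can be sketched as a circular buffer: rather than shifting 
every element when the window is full, keep the index of the oldest 
value and wrap reads and writes with a modulus. The class and method 
names below are hypothetical and are not part of the patch.

```java
// Hypothetical sketch of a fixed-size rolling window backed by a
// circular array. Names are illustrative, not the [math] API.
final class RollingWindow {
    private final double[] values;
    private int start = 0; // index of the oldest stored value
    private int count = 0; // number of values currently stored

    RollingWindow(int window) {
        if (window <= 0) {
            throw new IllegalArgumentException("window must be positive");
        }
        values = new double[window];
    }

    void insertValue(double v) {
        if (count < values.length) {
            // Still filling up: write into the next free slot.
            values[(start + count) % values.length] = v;
            count++;
        } else {
            // Full: overwrite the oldest value and advance the start index.
            values[start] = v;
            start = (start + 1) % values.length;
        }
    }

    // Returns the i-th value in insertion order, 0 being the oldest.
    double get(int i) {
        if (i < 0 || i >= count) {
            throw new IndexOutOfBoundsException("index " + i);
        }
        return values[(start + i) % values.length];
    }

    int size() {
        return count;
    }

    // Loops over the window from oldest to newest, wrapping with a modulus.
    double sum() {
        double total = 0.0;
        for (int i = 0; i < count; i++) {
            total += values[(start + i) % values.length];
        }
        return total;
    }

    public static void main(String[] args) {
        RollingWindow w = new RollingWindow(3);
        for (double v : new double[] {1, 2, 3, 4, 5}) {
            w.insertValue(v);
        }
        System.out.println(w.sum()); // sum of the last three values
    }
}
```

Each insert then touches one slot, at the cost of a modulus on every 
indexed read.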

-Mark

