commons-dev mailing list archives

From "Mark R. Diggory" <mdigg...@latte.harvard.edu>
Subject Re: [math] RandomData and ValueServer Failures . . .
Date Tue, 17 Jun 2003 01:34:01 GMT


Al Chou wrote:
> --- "Mark R. Diggory" <mdiggory@latte.harvard.edu> wrote:
> 
>>I occasionally get JUnit test failures from the ValueServer and 
>>RandomData tests. This appears to be because the sampled mean can 
>>sometimes deviate from the expected mean even for 1000-case draws. I 
>>know this happens rarely, but often enough over the last month or so 
>>that I have started to notice it. Oddly, when it's off, it's off in a 
>>big way, so it's not just a matter of changing the tolerance.
> 
> 
> Approximately how big is "off in a big way"?  Is it because a pseudorandom
> number generator is used dynamically in the test?  I guess I should look at the
> test code (which I'll try to do on the train home), but offhand it surprises me
> that the tests are ever far off from their expected results.
> 

Greater than the tolerance for the tests, which surprises me because 
it's set to 0.1. The last time I saw it fail, the value was 
approx 5.1xxxxxxxx. Remember, this probably doesn't happen very often. 
It would be interesting to run a batch test and actually get an estimate 
of the failure rate.
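
A batch estimate along those lines could be sketched as below. This is only an illustration under stated assumptions: the class and method names are hypothetical, the values are drawn from a Gaussian via java.util.Random rather than from the actual RandomData/ValueServer code, and sigma = 1.0 is a placeholder, not the spread of the real test data.

```java
import java.util.Random;

/** Estimates how often a fixed-tolerance check on a sample mean fails. */
final class BatchFailureEstimate {

    /**
     * Repeats the "draw sampleSize values, compare the mean" experiment and
     * returns the fraction of runs whose sample mean misses expectedMean by
     * more than tolerance. Gaussian draws are an assumption of this sketch.
     */
    public static double estimateFailureRate(double expectedMean, double sigma,
                                             int sampleSize, double tolerance,
                                             int runs, long seed) {
        Random rng = new Random(seed); // fixed seed keeps the estimate reproducible
        int failures = 0;
        for (int r = 0; r < runs; r++) {
            double sum = 0.0;
            for (int i = 0; i < sampleSize; i++) {
                sum += expectedMean + sigma * rng.nextGaussian();
            }
            double sampleMean = sum / sampleSize;
            if (Math.abs(sampleMean - expectedMean) > tolerance) {
                failures++;
            }
        }
        return (double) failures / runs;
    }

    public static void main(String[] args) {
        // sigma = 1.0 is a placeholder, not the spread used by the real tests.
        double rate = estimateFailureRate(5.069831575018909, 1.0, 1000, 0.1, 10_000, 42L);
        System.out.println("estimated failure rate: " + rate);
    }
}
```

The standard error of a mean of n independent draws is sigma / sqrt(n), so with n = 1000 a tolerance of 0.1 sits about 3.16 / sigma standard errors from the expected mean. How rare the failures are therefore depends strongly on the spread of the sampled values.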

assertEquals("mean", 5.069831575018909, stats.getMean(), tolerance);

What one has to consider is that stats.getMean() is a sample mean, which 
varies around the population mean from one draw to the next. Rarely, 
the sample is of poor enough quality that it does not effectively describe 
the population's mean and variance. There is always a small probability 
that the mean of the sample will not match the mean of the population. So 
testing with an "assertEquals" isn't very helpful for checking 
the mean. (I remember an earlier discussion on the list 
about having something like an assertApproximatelyEquals.)

But, what I think we could really use are statistically based assertion 
tests!!!!!!!!!!!!!!!! :-)

double[] population;
double tolerance = 0.05;

assertStudentsT("mean", population, stats, tolerance);

Then the test would be more like a t-test of sorts, testing whether the 
sampled set of values is "significantly different" from the population 
set. I get the feeling that this would be a stronger assertion than 
testing whether the means and standard deviations are equal.
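
To make the idea concrete, here is a minimal sketch of what such an assertion might look like. Everything in it is an assumption for illustration: the class and method names are hypothetical (this is not the assertStudentsT signature above, and not an existing JUnit or commons-math API), it takes two plain arrays instead of a stats object, it uses Welch's form of the t statistic, and the caller passes a critical value directly rather than a significance level, because turning a level into an exact Student's t quantile needs the t distribution itself.

```java
/** Sketch of a statistically based assertion comparing the means of two samples. */
final class StatisticalAssert {

    public static double mean(double[] x) {
        double sum = 0.0;
        for (double v : x) {
            sum += v;
        }
        return sum / x.length;
    }

    /** Unbiased sample variance (divides by n - 1). */
    public static double variance(double[] x) {
        double m = mean(x);
        double sumSquares = 0.0;
        for (double v : x) {
            sumSquares += (v - m) * (v - m);
        }
        return sumSquares / (x.length - 1);
    }

    /** Welch's t statistic for the difference between the two means. */
    public static double welchT(double[] a, double[] b) {
        double standardError =
                Math.sqrt(variance(a) / a.length + variance(b) / b.length);
        return (mean(a) - mean(b)) / standardError;
    }

    /**
     * Throws AssertionError when |t| exceeds criticalValue. For large samples
     * 1.96 approximates a two-sided 5% test (normal approximation, not an
     * exact t quantile).
     */
    public static void assertMeansNotSignificantlyDifferent(
            String message, double[] population, double[] sample,
            double criticalValue) {
        double t = welchT(population, sample);
        if (Math.abs(t) > criticalValue) {
            throw new AssertionError(message + ": |t| = " + Math.abs(t)
                    + " exceeds critical value " + criticalValue);
        }
    }
}
```

One design point to keep in mind: a two-sided test at the 5% level rejects about one run in twenty even when the generator is correct, so a unit test built this way would want a much stricter critical value, a fixed seed, or both.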

Any ideas?
-Mark

