On 9/17/07, William J Rust <wjr@weru.ksu.edu> wrote:
> I'm working on a climate simulation program that takes monthly averages
> and generates daily readings that are assumed to be normally
> distributed. The following program creates 10 sets of 100,000 random
> deviates with mean 10 and SD 5. It then applies a t test (results below)
> to ensure that the generated numbers are good enough. As the results
> show, they aren't. I'm wondering a) I am doing something wrong or b) is
> there something wrong with the stats routines?
There are a couple of problems here. First, while your inversion
method should generate approximately normally distributed values, it
is better to use the JDK-supplied method for this (much faster and a
better algorithm). A wrapped version of it is provided in
org.apache.commons.math.random.RandomDataImpl. To use that:

    import org.apache.commons.math.random.RandomData;
    import org.apache.commons.math.random.RandomDataImpl;

    RandomData randomData = new RandomDataImpl();
    ...
    arry[idx] = randomData.nextGaussian(10, 5);

Second, I don't understand what you are expecting from the t-test.
TestUtils.tTest(mu, array) returns the p-value associated with a
two-tailed test of the null hypothesis that the values in the array
come from a distribution with mean = mu. So small p-values, say less
than .01, would indicate that the mean appears to differ significantly
from 10. When the null hypothesis is true, this should happen roughly
one in every 100 times. Differences as large as what you observed on
your first run should happen about 34 out of every 100 times, etc. The
values reported below do not look surprising to me. They do not support
rejecting the null hypothesis that the mean is what it is supposed to
be, which is a good thing.
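
To make the "one in every 100 times" point concrete, here is a minimal
sketch that uses only the JDK, not Commons Math. The class and method
names are made up for illustration. It draws repeated samples from a
normal distribution with mean 10 and SD 5, computes the one-sample t
statistic by hand, and counts how often |t| exceeds 2.576. That value is
the two-sided 1% critical value of the standard normal, assumed here to
be a close enough stand-in for the t critical value at this sample size.

```java
import java.util.Random;

/** Sketch: how often a one-sample t-test rejects a true null at the 1% level. */
final class TTestRejectionSketch {

    /** One-sample t statistic for the null hypothesis "mean == mu". */
    static double tStatistic(double mu, double[] x) {
        int n = x.length;
        double sum = 0.0;
        for (double v : x) {
            sum += v;
        }
        double mean = sum / n;
        double ss = 0.0;
        for (double v : x) {
            ss += (v - mean) * (v - mean);
        }
        double sd = Math.sqrt(ss / (n - 1)); // sample standard deviation
        return (mean - mu) / (sd / Math.sqrt(n));
    }

    /** Fraction of trials in which |t| exceeds the given critical value. */
    static double rejectionRate(int trials, int n, double critical, long seed) {
        Random rng = new Random(seed);
        double[] x = new double[n];
        int rejections = 0;
        for (int t = 0; t < trials; t++) {
            for (int i = 0; i < n; i++) {
                x[i] = 10.0 + 5.0 * rng.nextGaussian(); // mean 10, SD 5
            }
            if (Math.abs(tStatistic(10.0, x)) > critical) {
                rejections++;
            }
        }
        return (double) rejections / trials;
    }

    public static void main(String[] args) {
        // Assumption: 2.576 (normal critical value) approximates the
        // two-sided 1% t critical value for samples of size 1000.
        double rate = rejectionRate(2000, 1000, 2.576, 42L);
        System.out.println("fraction rejected at the 1% level: " + rate);
    }
}
```

The printed fraction should land near 0.01 even though every sample
really does have mean 10, so an occasional small p-value in a batch of
runs is expected and is not evidence of a bad generator.
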
To test normality of the deviates, you should apply a normality test
to the deviates themselves, e.g. a Kolmogorov-Smirnov test. Commons
Math does not currently include normality tests (patches welcome :).
To do this, you would need to dump the generated arrays to a file and
then do the test with R or some other package that includes normality
tests.
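
If you just want a rough check without leaving Java, the KS statistic
itself is easy to compute by hand. The following is only a sketch under
stated assumptions, not a Commons Math API: the class and method names
are hypothetical, the normal CDF uses the Abramowitz-Stegun 7.1.26
approximation to erf (absolute error around 1.5e-7), and 1.36/sqrt(n)
is the usual asymptotic 5% critical value for the one-sample statistic.

```java
import java.util.Arrays;
import java.util.Random;

/** Sketch: one-sample Kolmogorov-Smirnov statistic against a normal distribution. */
final class KsStatisticSketch {

    /** Standard normal CDF via the Abramowitz-Stegun 7.1.26 erf approximation. */
    static double normalCdf(double z) {
        double x = Math.abs(z) / Math.sqrt(2.0);
        double t = 1.0 / (1.0 + 0.3275911 * x);
        double poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741
                + t * (-1.453152027 + t * 1.061405429))));
        double erf = 1.0 - poly * Math.exp(-x * x);
        return z >= 0 ? 0.5 * (1.0 + erf) : 0.5 * (1.0 - erf);
    }

    /** Largest gap between the empirical CDF of data and the N(mean, sd) CDF. */
    static double ksStatistic(double[] data, double mean, double sd) {
        double[] x = data.clone();
        Arrays.sort(x);
        int n = x.length;
        double d = 0.0;
        for (int i = 0; i < n; i++) {
            double f = normalCdf((x[i] - mean) / sd);
            // The empirical CDF jumps from i/n to (i+1)/n at x[i]; check both sides.
            d = Math.max(d, Math.max((i + 1.0) / n - f, f - (double) i / n));
        }
        return d;
    }

    public static void main(String[] args) {
        int n = 100000;
        Random rng = new Random(1L);
        double[] x = new double[n];
        for (int i = 0; i < n; i++) {
            x[i] = 10.0 + 5.0 * rng.nextGaussian();
        }
        double d = ksStatistic(x, 10.0, 5.0);
        System.out.println("D = " + d
                + ", approximate 5% critical value = " + 1.36 / Math.sqrt(n));
    }
}
```

A value of D well above the critical value would be a reason to look
more closely with a proper statistics package; a value below it is
consistent with normality.
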
Unless I am missing something, I don't think a t-test is going to give
you the information that you need to verify that the generated values
are normally distributed. Another thing that you could do is to
examine the empirical distribution of the generated values: lay a
grid over the range, count how many values fall into each cell, and
compare these counts to what you would expect under the hypothesis of
normality (similar in spirit to what the KS test does). You can use
org.apache.commons.math.random.EmpiricalDistribution to bin the
generated data and get bin counts.
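
The grid idea can also be sketched with plain JDK code. This is an
illustration only and does not use EmpiricalDistribution; the class and
method names are hypothetical. It assumes six bins, each one SD wide
(with open-ended tails), whose probabilities under normality are
hard-coded from standard normal tables, and it summarizes the
observed-versus-expected comparison as a chi-square statistic.

```java
import java.util.Random;

/** Sketch: bin generated values and compare counts with normal expectations. */
final class BinCountSketch {

    // Normal probabilities for the bins (-inf,-2), [-2,-1), [-1,0), [0,1),
    // [1,2), [2,+inf), measured in SDs from the mean.
    static final double[] EXPECTED_PROB =
            {0.02275, 0.13591, 0.34134, 0.34134, 0.13591, 0.02275};

    /** Counts how many values fall into each one-SD-wide bin. */
    static long[] binCounts(double[] data, double mean, double sd) {
        long[] counts = new long[EXPECTED_PROB.length];
        for (double v : data) {
            double z = (v - mean) / sd;
            // Clamp in double arithmetic first so extreme values cannot overflow.
            int bin = (int) Math.max(0.0, Math.min(5.0, Math.floor(z) + 3.0));
            counts[bin]++;
        }
        return counts;
    }

    /** Chi-square statistic: sum over bins of (observed - expected)^2 / expected. */
    static double chiSquare(long[] counts) {
        long n = 0;
        for (long c : counts) {
            n += c;
        }
        double stat = 0.0;
        for (int i = 0; i < counts.length; i++) {
            double expected = n * EXPECTED_PROB[i];
            stat += (counts[i] - expected) * (counts[i] - expected) / expected;
        }
        return stat;
    }

    public static void main(String[] args) {
        int n = 100000;
        Random rng = new Random(1L);
        double[] x = new double[n];
        for (int i = 0; i < n; i++) {
            x[i] = 10.0 + 5.0 * rng.nextGaussian();
        }
        long[] counts = binCounts(x, 10.0, 5.0);
        for (int i = 0; i < counts.length; i++) {
            System.out.println("bin " + i + ": observed " + counts[i]
                    + ", expected " + n * EXPECTED_PROB[i]);
        }
        // With 6 bins and known parameters there are 5 degrees of freedom;
        // the 5% critical value of chi-square with 5 df is about 11.07.
        System.out.println("chi-square = " + chiSquare(counts));
    }
}
```

Observed counts that track the expected ones, with a chi-square value
below roughly 11.07, are what you would hope to see for both the
inversion method and nextGaussian.
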
If you do find that normality tests fail on the generated values using
either your inversion method or the RandomDataImpl.nextGaussian
method, please open a Jira ticket
(http://commons.apache.org/math/issuetracking.html) including the R
script or output from the package that you used for testing. Thanks!
hth,
Phil
>
> Thanks,
>
> wjr
>
> package usda.weru.cligen2;
>
> import org.apache.commons.math.MathException;
>
> /**
>  *
>  * @author wjr
>  */
> public class TestNormal {
>
>     static org.apache.commons.math.distribution.NormalDistributionImpl nd =
>             new org.apache.commons.math.distribution.NormalDistributionImpl(10, 5);
>
>     public static void main(String[] args) {
>         double[] arry = new double[100000];
>         java.util.Random ran = new java.util.Random(1L);
>
>         for (int jdx = 0; jdx < 10; jdx++) {
>             for (int idx = 0; idx < arry.length; idx++) {
>                 try {
>                     arry[idx] =
>                             nd.inverseCumulativeProbability(ran.nextDouble());
>                 } catch (MathException ex) {
>                     ex.printStackTrace();
>                 }
>             }
>             try {
>                 System.out.println("ttest " +
>                         org.apache.commons.math.stat.inference.TestUtils.tTest(10, arry));
>             } catch (IllegalArgumentException ex) {
>                 ex.printStackTrace();
>             } catch (MathException ex) {
>                 ex.printStackTrace();
>             }
>         }
>     }
> }
>
> Output:
>
> >
> > runsingle:
> > ttest 0.3433300114960922
> > ttest 0.1431930575825282
> > ttest 0.12336027805916228
> > ttest 0.49478850669361796
> > ttest 0.9216887341410063
> > ttest 0.9937228334312525
> > ttest 0.13669784550400177
> > ttest 0.9646134537758599
> > ttest 0.9965741269090211
> > ttest 0.03815948891784959
> > BUILD SUCCESSFUL (total time: 20 seconds)
>
>
>
> 
> To unsubscribe, email: user-unsubscribe@commons.apache.org
> For additional commands, email: user-help@commons.apache.org
>
>

