 Piotr Kochański <pi@uw.edu.pl> wrote:
<snip/>
> > My thought was that we could do some things (e.g. estimate confidence
> > intervals) without storing the bootstrap samples or even the full set of
> > bootstrap statistics.
>
> This is not a problem at all. When we initialize EmpiricalDistribution
> using the load(...) method, we can calculate what we want, since we have
> the data set at that moment.
>
> The problem I see is that we have to specify a priori for which statistics
> the (bootstrap) confidence interval or standard error would be calculated.
>
> We should not make that decision for the user, so some configuration of
> EmpiricalDistribution object would be necessary, e.g.
>
> load(double[][], UnivariateStatistics[])
>
> Then all the interesting calculations would be done for the provided
> UnivariateStatistics. The default choice could be just SummaryStatistics:
> load(double[][]){
> statisticsToBeBootstrapped[] = All SummaryStatistics
> }
>
> If bootstrap samples are not provided, e.g. the user calls another
> load function, we can provide confidence intervals based on the
> normal distribution assumption (for those statistics for which
> they can be calculated).
>
> In fact we could leave the choice of which summary statistics are
> calculated entirely to the user (e.g. for performance reasons: someone
> may never be interested in some statistics, yet they are calculated
> anyway, which slows down initialization of the object).
>
> load(String, UnivariateStatistics[]) etc.
>
> Then the present getSampleStats() method should return
> an object which gives access to the calculated statistics and/or
> the confidence intervals for them.
>
Ah, now I understand what you have been trying to communicate and I agree
that adding all of this functionality to EmpiricalDistribution is not a
good idea. I was only considering the simple use case of modelling the
sampling distribution of a single, known statistic. The more general case
in which the bootstrap samples are leveraged for inferences about multiple
statistics will require more complex machinery. I suggest that we take
this up again post 1.0. For now, I don't think it makes sense to
significantly modify EmpiricalDistribution (though given the confusion, it
might be better to change the name :)
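
For the record, the simple case I had in mind could look roughly like the
sketch below: one known statistic, with neither the bootstrap samples nor
the individual bootstrap statistics kept around. Everything in it
(StreamingBootstrap, Statistic, normalInterval) is made up for illustration
and is not existing commons-math API. It also assumes the sampling
distribution is close enough to normal for a z-based interval. A percentile
interval would need the full set of bootstrap statistics.

```java
import java.util.Random;

/** Hypothetical helper for illustration only; not part of commons-math. */
final class StreamingBootstrap {

    /** Hypothetical callback for the single statistic being bootstrapped. */
    interface Statistic {
        double evaluate(double[] values);
    }

    /**
     * Returns {estimate, lower, upper}: the statistic evaluated on the
     * original data, plus a normal-approximation interval built from the
     * bootstrap standard error. Only running moments are stored.
     */
    static double[] normalInterval(double[] data, Statistic stat,
                                   int resamples, double z, Random rng) {
        if (data.length == 0 || resamples < 2) {
            throw new IllegalArgumentException(
                "need non-empty data and at least 2 resamples");
        }
        double[] scratch = new double[data.length]; // reused for every resample
        double mean = 0.0; // running mean of the bootstrap statistics
        double m2 = 0.0;   // running sum of squared deviations (Welford)
        for (int b = 1; b <= resamples; b++) {
            // Draw one resample with replacement into the scratch buffer.
            for (int i = 0; i < scratch.length; i++) {
                scratch[i] = data[rng.nextInt(data.length)];
            }
            double value = stat.evaluate(scratch);
            // Welford's update: no list of past values is needed.
            double delta = value - mean;
            mean += delta / b;
            m2 += delta * (value - mean);
        }
        double stdError = Math.sqrt(m2 / (resamples - 1));
        double estimate = stat.evaluate(data);
        return new double[] {
            estimate, estimate - z * stdError, estimate + z * stdError
        };
    }

    /** Example statistic: the arithmetic mean. */
    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }
}
```

Something like normalInterval(data, StreamingBootstrap::mean, 2000, 1.96,
new Random()) would give an approximate 95% interval for the mean. Handling
several statistics at once, or percentile intervals, is exactly where the
extra machinery comes in.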
Phil

