From: "Phil Steitz"
To: "Jakarta Commons Developers List"
Date: Mon, 16 Feb 2004 07:47:00 -0700
Subject: Re: [math] EmpiricalDistribution improvments
Message-ID: <4030D7E4.2050704@steitz.com>
References: <402B063C.4050902@steitz.com>
Mailing-List: contact commons-dev-help@jakarta.apache.org; run by ezmlm

Piotr Kochański wrote:
> Phil Steitz wrote:
>
>>1. Either remove or implement the "not implemented yet" distribution
>>persistence methods. I am ambivalent on these, maybe just supporting
>>serialization is enough.
>
> The question is whether it happens very often that we obtain data in
> the form of the EDF. This might be the case if data are pre-processed
> using a different application (or experimental equipment)...

The use case that I had in mind was repeated simulation runs using the
same source dataset -- for this it would be handy to be able to digest a
large dataset once and then reload just the digest (EDF) for subsequent
runs.

> I'm thinking about the best form in which EmpiricalDistribution can be
> saved,
> maybe saving pairs
> observed_value_i = probability_i
> would do the job?

There is more data than that -- remember the bin stats, etc. If we want
to do it in a platform-independent way, that will be interesting;
otherwise we could just serialize the whole mess using Java
serialization (hence the comment that maybe just implementing
Serializable is enough).

>>3. Develop some sort of rationale for the test tolerances. This is an
>>interesting mathstat problem. I would ideally like to use statistical
>>tests (like elsewhere in the random package), but it is not obvious what
>>the right test or test parameters should be.
>
> As long as we test means or variances we can use a t test or some
> variance equality test (Levene test). However, we need to choose a
> significance level anyway, so there is still an arbitrary number (like
> the "tolerance" we have now);
> on the other hand, this number has a clear interpretation.

Yes, that is the problem.
I don't see exactly how we can correctly set the df for the t-test, for
example, since the sampling distribution of the "mean of EDF-generated
values" is sort of an ugly beast that depends on the number and
dispersion of the original values as well as the number of bins and the
number of generated values.

Phil

>
> Piotr
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: commons-dev-help@jakarta.apache.org
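
A minimal sketch of the "just use Java serialization" option from the persistence discussion above: digest a dataset once into bin counts, serialize the digest, and reload it for a later run. The class EdfDigest, its fields, and its methods are hypothetical names made up for this illustration; they are not the EmpiricalDistribution API, and a real digest would carry more per-bin statistics than a count.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Arrays;

/** Hypothetical digest of a dataset: overall range plus a count per equal-width bin. */
final class EdfDigest implements Serializable {
    private static final long serialVersionUID = 1L;

    private final double min;
    private final double max;
    private final long[] binCounts;

    private EdfDigest(double min, double max, long[] binCounts) {
        this.min = min;
        this.max = max;
        this.binCounts = binCounts;
    }

    /** Digests the raw data once; only the digest needs to be kept afterwards. */
    static EdfDigest fromData(double[] data, int bins) {
        if (data.length == 0 || bins < 1) {
            throw new IllegalArgumentException("need data and at least one bin");
        }
        double min = data[0];
        double max = data[0];
        for (double x : data) {
            min = Math.min(min, x);
            max = Math.max(max, x);
        }
        long[] counts = new long[bins];
        double width = (max - min) / bins;
        for (double x : data) {
            // The maximum value would index one past the end, so clamp it into the last bin.
            int i = width == 0 ? 0 : (int) ((x - min) / width);
            counts[Math.min(i, bins - 1)]++;
        }
        return new EdfDigest(min, max, counts);
    }

    /** Serializes the digest with standard Java serialization. */
    byte[] toBytes() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(this);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Reloads a digest previously written by toBytes(). */
    static EdfDigest fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (EdfDigest) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    double min() { return min; }
    double max() { return max; }
    long[] binCounts() { return binCounts.clone(); }

    public static void main(String[] args) {
        EdfDigest digest = EdfDigest.fromData(new double[] {0, 1, 2, 3, 4, 10}, 2);
        EdfDigest reloaded = EdfDigest.fromBytes(digest.toBytes());
        System.out.println(Arrays.toString(reloaded.binCounts()));
    }
}
```

The trade-off is the one raised in the thread: this form is tied to Java and to the class layout, so a platform-independent format would need an explicit text or binary encoding of the same fields.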