Date: Wed, 21 May 2003 15:06:46 -0400 (EDT)
From: Tim O'Brien
To: Jakarta Commons Developers List
Subject: Re: [math] Priorities, help needed
In-Reply-To: <3ECBC9CE.1060500@latte.harvard.edu>

Good foot work. GeometricMean is only valid for a set of positive data.
Instead of introducing extensions to UnivariateImpl, StoredUnivariateImpl,
and ListUnivariateImpl - it might be preferable to do this:

1.
Trying to obtain a measure which is only valid for positive numbers on a
set that contains non-positive values will return Double.NaN by default.
Double.NaN is a good marker for "The answer to your question does not
exist". This way we do not blindly discard zero values. Assume 1000
values and you add a zero into the set of values - getProduct() should
correctly return 0.0 and getGeometricMean() should return Double.NaN.

2. There should be two boolean property flags, setIgnoreZero() and
setIgnoreNegative() - possibly also a setIgnoreNonPositive() which would
call the previous two methods. Depending on which flags are set, the
Univariate implementation will globally ignore all zero or negative
values.

#2 would allow us to support measures which only have meaning for
positive numbers, and it will require that an end user EXPLICITLY set
flags that will ignore data. That is key to the design: any
implementation which ignores data by default is unacceptable. If we are
going to ignore data at all, it should only be ignored after explicit
library user action.

What do you think? This solution would avoid "class bloom".

On Wed, 21 May 2003, Mark R. Diggory wrote:

> See my last email concerning positive values. I agree with what you've
> said below about geo <= arith and my little hack. I think forcing these
> into an extension is possibly a better solution in terms of accurately
> calculating these statistics and being able to control the values used
> to calculate them properly.
>
> -Mark
>
> Tim O'Brien wrote:
> > On Wed, 21 May 2003, Mark R. Diggory wrote:
> >
> >
> >>I hadn't thought of that. multi*=0 is also very detrimental to this
> >>calculation as well. Would it be logical that if the value is 0.0 that
> >>it gets excluded from the division/multiplication?
> >
> >
> > Well, if the geometric mean depends on the product of all values and the
> > set of values contains a zero value, I'd say that the result should be
> > zero.
> > Think of the following situations, would the geometric mean of a
> > set of 8 zero values be 1.0? Geometric mean is always <= arithmetic mean,
> > I'd say if it depends on a product it should reflect the product.
> >
> >
> >>if(discard!=0)
> >>   multi/=discard;
> >>
> >>if(value != 0)
> >>   multi*=discard;
> >>
> >>But if n < window then n still gets incremented. This is basically
> >>treating 0 = 1 in nature. I'm not sure if it's numerically kosher to do.
> >>
> >>Thus maybe we see the weakness of the Geometric Mean.
> >>-Mark
> >>
> >>
> >>Tim O'Brien wrote:
> >>
> >>>On Wed, 21 May 2003, Mark R. Diggory wrote:
> >>>
> >>>
> >>>>multi/=discard;
> >>>
> >>>
> >>>That doesn't work. Once you've introduced a value 0.0, there is no way to
> >>>divide out a zero.

--
----------------------
Tim O'Brien
Evanston, IL
(847) 863-7045
tobrien@discursive.com

---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org
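
For illustration only, the two-part proposal in this message (Double.NaN by default, data ignored only after an explicit flag is set) might be sketched roughly as below. The class name PositiveOnlyStats and its internals are hypothetical and are not the actual Univariate, UnivariateImpl, or StoredUnivariateImpl code. The sketch also assumes that values are stored and the flags are applied when a statistic is requested, which is only one possible reading of "globally ignore".

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the proposal; not the real commons-math API. */
class PositiveOnlyStats {
    private final List<Double> values = new ArrayList<>();

    // Both flags are off by default, so no data is ever dropped silently.
    private boolean ignoreZero = false;
    private boolean ignoreNegative = false;

    public void setIgnoreZero(boolean flag) { ignoreZero = flag; }

    public void setIgnoreNegative(boolean flag) { ignoreNegative = flag; }

    /** Convenience setter that simply calls the two setters above. */
    public void setIgnoreNonPositive(boolean flag) {
        setIgnoreZero(flag);
        setIgnoreNegative(flag);
    }

    public void addValue(double v) { values.add(v); }

    /** True only if the user explicitly asked to skip this kind of value. */
    private boolean isIgnored(double v) {
        return (ignoreZero && v == 0.0) || (ignoreNegative && v < 0.0);
    }

    /** Product of all values that are not explicitly ignored. */
    public double getProduct() {
        double product = 1.0;
        for (double v : values) {
            if (!isIgnored(v)) {
                product *= v;
            }
        }
        return product;
    }

    /**
     * Geometric mean of the values that are not explicitly ignored.
     * Returns Double.NaN if any remaining value is non-positive,
     * or if no values remain.
     */
    public double getGeometricMean() {
        double sumOfLogs = 0.0;
        int n = 0;
        for (double v : values) {
            if (isIgnored(v)) {
                continue;
            }
            if (v <= 0.0) {
                return Double.NaN; // the answer does not exist for this data
            }
            sumOfLogs += Math.log(v);
            n++;
        }
        return n == 0 ? Double.NaN : Math.exp(sumOfLogs / n);
    }
}
```

With the values 2.0, 8.0 and 0.0 added, getProduct() returns 0.0 and getGeometricMean() returns Double.NaN. Only after an explicit setIgnoreZero(true) does the geometric mean become 4.0 (up to floating-point rounding). A set of 8 zero values with the flag set yields Double.NaN rather than 1.0, because no values remain.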