Return-Path: X-Original-To: apmail-commons-issues-archive@minotaur.apache.org Delivered-To: apmail-commons-issues-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 95C559C93 for ; Thu, 23 Feb 2012 21:44:16 +0000 (UTC) Received: (qmail 12907 invoked by uid 500); 23 Feb 2012 21:44:14 -0000 Delivered-To: apmail-commons-issues-archive@commons.apache.org Received: (qmail 12498 invoked by uid 500); 23 Feb 2012 21:44:14 -0000 Mailing-List: contact issues-help@commons.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: issues@commons.apache.org Delivered-To: mailing list issues@commons.apache.org Received: (qmail 12356 invoked by uid 99); 23 Feb 2012 21:44:10 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 23 Feb 2012 21:44:10 +0000 X-ASF-Spam-Status: No, hits=-2000.0 required=5.0 tests=ALL_TRUSTED,T_RP_MATCHES_RCVD X-Spam-Check-By: apache.org Received: from [140.211.11.116] (HELO hel.zones.apache.org) (140.211.11.116) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 23 Feb 2012 21:44:08 +0000 Received: from hel.zones.apache.org (hel.zones.apache.org [140.211.11.116]) by hel.zones.apache.org (Postfix) with ESMTP id D7A75337278 for ; Thu, 23 Feb 2012 21:43:48 +0000 (UTC) Date: Thu, 23 Feb 2012 21:43:48 +0000 (UTC) From: "Thomas Neidhart (Updated) (JIRA)" To: issues@commons.apache.org Message-ID: <2067406993.11909.1330033428884.JavaMail.tomcat@hel.zones.apache.org> In-Reply-To: <773238334.35670.1313216907163.JavaMail.tomcat@hel.zones.apache.org> Subject: [jira] [Updated] (MATH-644) for the class of hyper-geometric distribution, for some number the method "upperCumulativeProbability" return a probability greater than 1 which is impossible. MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/MATH-644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Neidhart updated MATH-644: --------------------------------- Comment: was deleted (was: I digged a bit into the problem. The HypergeometricDistribution calculates the probability for a given x using the following formula: {code:java} private double probability(int n, int m, int k, int x) { return FastMath.exp(ArithmeticUtils.binomialCoefficientLog(m, x) + ArithmeticUtils.binomialCoefficientLog(n - m, k - x) - ArithmeticUtils.binomialCoefficientLog(n, k)); } {code} Thus it transforms the binomial coefficients to a logarithmic scale in order to cope with the possibly large results (and to be able to compute the bincoeff at all). But, imho, the reverse transformation is broken, as it does not take any scaling into account. As the coefficients get larger (e.g. due to a large n), the differences between the terms will become smaller in log scale, and thus incorrectly transformed back to linear scale. Taking scaling into account, the exp function will most likely fail for large n. I have created a simple test to illustrate the problem, the t{x} correspond to the binomial coeff terms from the formula, diff is the input the the exp function. This loop simulates an increasing n, and the expectation is that the result should get smaller with increasing n: {code} t1=0.0, t2=4547.288942497606, t3=4770.9627189150215, diff=-223.67377641741587, result=7.23957639711833E-98 t1=0.0, t2=12183.221706275828, t3=12186.968419291079, diff=-3.7467130152508616, result=0.023595175513309037 t1=0.0, t2=13444.672093808651, t3=13446.561352727935, diff=-1.8892589192837477, result=0.15118380673528464 t1=0.0, t2=14186.229425843805, t3=14187.492505971342, diff=-1.2630801275372505, result=0.2827816800804864 t1=0.0, t2=14713.395226772875, t3=14714.343882929534, diff=-0.9486561566591263, result=0.3872610921706871 t1=0.0, t2=15122.726785860956, t3=15123.486358374357, diff=-0.7595725134015083, result=0.46786639087791215 t1=0.0, t2=15457.396298892796, t3=15458.029636271298, diff=-0.6333373785018921, result=0.5308173033122051 t1=0.0, t2=15740.484181590378, t3=15741.027263000607, diff=-0.5430814102292061, result=0.5809553297318205 t1=0.0, t2=15985.787659011781, t3=15986.263000234962, diff=-0.47534122318029404, result=0.6216728910682101 t1=0.0, t2=16202.21559868753, t3=16202.638224512339, diff=-0.4226258248090744, result=0.6553237931479635 t1=0.0, t2=16395.855738580227, t3=16396.236174091697, diff=-0.380435511469841, result=0.6835636445695273 {code} hmm, after some second thoughts, I am not sure if the analysis is correct, and the problem is hidden somewhere else.) > for the class of hyper-geometric distribution, for some number the method "upperCumulativeProbability" return a probability greater than 1 which is impossible. > ----------------------------------------------------------------------------------------------------------------------------------------------------------------- > > Key: MATH-644 > URL: https://issues.apache.org/jira/browse/MATH-644 > Project: Commons Math > Issue Type: Bug > Affects Versions: 2.2 > Reporter: marzieh > Priority: Minor > Labels: hypergeometric, probability > Fix For: 3.1 > > > In windows 7, I used common.Math library. I used class "HypergeometricDistributionImpl" and method "upperCumulativeProbability" of zero for distribution and the return value is larget than 1. the following code is working error. > HypergeometricDistributionImpl u = new HypergeometricDistributionImpl(14761461, 1035 ,1841 ); > System.out.println(u.upperCumulativeProbability(0)) > Thanks -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira