From issues-return-22263-apmail-commons-issues-archive=commons.apache.org@commons.apache.org Sat Oct 29 04:24:03 2011
Date: Sat, 29 Oct 2011 04:23:32 +0000 (UTC)
From: Sébastien Brisard (Commented) (JIRA)
To: issues@commons.apache.org
Message-ID: <2632100.35629.1319862212543.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1175555992.7216.1318960990578.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (MATH-692) Cumulative probability and inverse
cumulative probability inconsistencies
[ https://issues.apache.org/jira/browse/MATH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13139105#comment-13139105 ]
Sébastien Brisard commented on MATH-692:
----------------------------------------
Hi Christian,
{quote}
Hi Sébastien,
the problem with the plateau is indeed one issue which needs to be solved.
{quote}
I'm working on it...
{quote}
Additionally, AbstractDistribution will need an implementation of inverseCumulativeProbability. In fact, both implementations should be the same except for the solver used. Thus inverseCumulativeProbability should be implemented just once in AbstractDistribution, and the solver invocation should be moved into a separate method so that it can be overridden in AbstractContinuousDistribution.
{quote}
OK, for now, I'm concentrating on making the current implementation in {{AbstractContinuousDistribution}} more robust. The other implementation should be easier.
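To make the proposed structure concrete, here is a minimal sketch of "implement the inverse once, override only the solver call". All class and method names below ({{SketchDistribution}}, {{SketchContinuousDistribution}}, {{solve}}, the bound accessors) are hypothetical stand-ins, not the Commons Math classes, and the finite bounds are assumed to satisfy cdf(lower) < p <= cdf(upper). The hand-written Illinois step is only an illustration of a faster solver, not a proposal for the actual choice.

```java
/** Hypothetical stand-in for AbstractDistribution (not the Commons Math class). */
abstract class SketchDistribution {
    /** Returns P(X <= x). */
    abstract double cumulativeProbability(double x);

    /** Finite bounds, assumed to satisfy cdf(lower) < p <= cdf(upper). */
    abstract double lowerBound();
    abstract double upperBound();

    /** Implemented once for all distributions; only the solver call varies. */
    final double inverseCumulativeProbability(double p) {
        if (!(p > 0.0 && p <= 1.0)) {
            throw new IllegalArgumentException("p must be in (0, 1]: " + p);
        }
        return solve(p, lowerBound(), upperBound());
    }

    /** Default hook: bisection, which does not need a continuous cdf. */
    protected double solve(double p, double lo, double hi) {
        // Invariant: cdf(lo) < p <= cdf(hi), so hi tends to inf{x | cdf(x) >= p}.
        for (int i = 0; i < 200 && hi - lo > 1e-12; i++) {
            double mid = 0.5 * (lo + hi);
            if (cumulativeProbability(mid) >= p) { hi = mid; } else { lo = mid; }
        }
        return hi;
    }
}

/** Hypothetical stand-in for AbstractContinuousDistribution. */
abstract class SketchContinuousDistribution extends SketchDistribution {
    /** Overrides only the solver step: Illinois variant of regula falsi. */
    @Override
    protected double solve(double p, double lo, double hi) {
        double fLo = cumulativeProbability(lo) - p;
        double fHi = cumulativeProbability(hi) - p;
        double x = hi;
        int side = 0; // which end was moved last: -1 upper, +1 lower
        for (int i = 0; i < 200 && hi - lo > 1e-13; i++) {
            x = (fLo * hi - fHi * lo) / (fLo - fHi);
            double fx = cumulativeProbability(x) - p;
            if (Math.abs(fx) < 1e-15) { break; }
            if (fx > 0.0) {
                hi = x; fHi = fx;
                if (side == -1) { fLo *= 0.5; } // same end twice: damp the stale end
                side = -1;
            } else {
                lo = x; fLo = fx;
                if (side == 1) { fHi *= 0.5; }
                side = 1;
            }
        }
        return x;
    }
}

/** Toy discontinuous cdf: a fair six-sided die. */
class DieSketch extends SketchDistribution {
    @Override double cumulativeProbability(double x) {
        return Math.max(0.0, Math.min(1.0, Math.floor(x) / 6.0));
    }
    @Override double lowerBound() { return 0.0; }
    @Override double upperBound() { return 6.0; }
}

/** Toy continuous cdf: exponential distribution with rate 1. */
class ExponentialSketch extends SketchContinuousDistribution {
    @Override double cumulativeProbability(double x) {
        return x <= 0.0 ? 0.0 : -Math.expm1(-x);
    }
    @Override double lowerBound() { return 0.0; }
    @Override double upperBound() { return 100.0; }
}
```

With these toy classes, {{new DieSketch().inverseCumulativeProbability(0.5)}} goes through the bisection default, while {{new ExponentialSketch().inverseCumulativeProbability(0.5)}} goes through the overridden solver and should land close to ln 2.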
{quote}
A third point is the choice of the solvers. For AbstractDistribution we need a solver which works even for discontinuous cdfs (BisectionSolver can do the job, but maybe the implementations of the faster IllinoisSolver, PegasusSolver, BrentSolver, or another solver can cope with discontinuities, too). For AbstractContinuousDistribution it would be beneficial to use a DifferentiableUnivariateRealSolver. However, the NewtonSolver cannot be used because its convergence is not guaranteed, and no alternative seems to exist yet. So we have to choose one of the other solvers for now.
{quote}
The current implementation uses a Brent solver. I think the solver itself is only one side of the issue. The other point is the algorithm used to bracket the solution, in order to ensure that the result is consistent with the definition of the cumulative probability. As for the {{DifferentiableUnivariateRealSolver}}, I'm not too sure. I guess it depends on what is meant by "continuous distribution". For me, it means that the random variable takes values in a continuous set, and possibly its distribution is defined by a density. However, in my view, nothing prevents occurrences of Dirac functions, in which case the cumulative distribution function is only piecewise C1. It's all a matter of definition, of course, and I'll ask the question on the forum to check whether or not people want to allow for such a situation.
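To illustrate why the bracketing matters as much as the solver, here is a rough sketch, not the implementation I am working on. It assumes a proper cdf (0 at -infinity, 1 at +infinity) and p in (0, 1]; the names {{QuantileSketch}}, {{bracket}}, {{quantile}} and {{mixedCdf}} are made up for this example. The toy cdf mixes half of a Uniform(0, 2) density with a Dirac mass of weight 0.5 at x = 1, so it jumps from 0.25 to 0.75 there.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical helper: bracketing plus bisection for inf{x | cdf(x) >= p}. */
final class QuantileSketch {
    private QuantileSketch() { }

    /** Expands an interval around start until cdf(lo) < p <= cdf(hi). */
    static double[] bracket(DoubleUnaryOperator cdf, double p, double start) {
        if (!(p > 0.0 && p <= 1.0)) {
            throw new IllegalArgumentException("p must be in (0, 1]: " + p);
        }
        double lo = start;
        double hi = start;
        // Move the lower end left with doubling steps while it is still too high.
        for (double step = 1.0; cdf.applyAsDouble(lo) >= p; step *= 2.0) {
            lo -= step;
        }
        // Move the upper end right with doubling steps while it is still too low.
        for (double step = 1.0; cdf.applyAsDouble(hi) < p; step *= 2.0) {
            hi += step;
        }
        return new double[] {lo, hi};
    }

    /** Bisection that preserves cdf(lo) < p <= cdf(hi) and returns the upper end. */
    static double quantile(DoubleUnaryOperator cdf, double p, double start) {
        double[] b = bracket(cdf, p, start);
        double lo = b[0];
        double hi = b[1];
        for (int i = 0; i < 200 && hi - lo > 1e-12; i++) {
            double mid = 0.5 * (lo + hi);
            if (cdf.applyAsDouble(mid) >= p) { hi = mid; } else { lo = mid; }
        }
        return hi;
    }

    /** Toy cdf: half of Uniform(0, 2) plus a Dirac mass of weight 0.5 at x = 1. */
    static double mixedCdf(double x) {
        if (x < 0.0) { return 0.0; }
        if (x >= 2.0) { return 1.0; }
        double continuousPart = 0.25 * x;
        double atom = x >= 1.0 ? 0.5 : 0.0;
        return continuousPart + atom;
    }
}
```

Because the invariant is stated with '<' on the lower end and '<=' on the upper end, every p in (0.25, 0.75] maps to the location of the jump, x = 1, which is what the definition inf{x | P(X <= x) >= p} requires. A solver that only looks for cdf(x) = p has no such guarantee on a plateau or at a jump.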
{quote}
As all these points are interdependent, I guess it's best to solve them as a whole. If you like, you can do this.
Best Regards,
Christian
{quote}
Yes, I'm very interested.
Best regards,
Sébastien

> Cumulative probability and inverse cumulative probability inconsistencies
> -------------------------------------------------------------------------
>
> Key: MATH-692
> URL: https://issues.apache.org/jira/browse/MATH-692
> Project: Commons Math
> Issue Type: Bug
> Affects Versions: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1, 2.2, 2.2.1, 3.0
> Reporter: Christian Winter
> Priority: Minor
> Fix For: 3.0
>
>
> There are some inconsistencies in the documentation and implementation of functions regarding cumulative probabilities and inverse cumulative probabilities. More precisely, '<' and '<=' are not used in a consistent way.
> Besides, I would move the function inverseCumulativeProbability(double) to the interface Distribution. A true inverse of the distribution function exists neither for Distribution nor for ContinuousDistribution. Thus we need to define the inverse in terms of quantiles anyway, and this can already be done for Distribution.
> On the whole I would declare the (inverse) cumulative probability functions in the basic distribution interfaces as follows:
> Distribution:
> - cumulativeProbability(double x): returns P(X <= x)
> - cumulativeProbability(double x0, double x1): returns P(x0 < X <= x1) [see also 1)]
> - inverseCumulativeProbability(double p):
> returns the quantile function inf{x in R | P(X<=x) >= p} [see also 2), 3), and 4)]
> 1) An alternative definition could be P(x0 <= X <= x1). But this requires putting the function probability(double x) or another cumulative probability function into the interface Distribution in order to be able to calculate P(x0 <= X <= x1) in AbstractDistribution.
> 2) This definition is stricter than the definition in ContinuousDistribution, because the definition there does not specify what to do if there are multiple x satisfying P(X<=x) = p.
> 3) A modification could be defined for p=0: Returning sup{x in R | P(X<=x) = 0} would yield the infimum of the distribution's support instead of a mandatory -infinity.
> 4) This affects issue MATH-540. I'd prefer the definition from above for the following reasons:
> - This definition simplifies inverse transform sampling (as mentioned in the other issue).
> - It is the standard textbook definition for the quantile function.
> - For integer distributions it has the advantage that the result doesn't change when switching to "x in Z", i.e. the result is independent of considering the integers as the sole set or as part of the reals.
> ContinuousDistribution:
> nothing to be added regarding (inverse) cumulative probability functions
> IntegerDistribution:
> - cumulativeProbability(int x): returns P(X <= x)
> - cumulativeProbability(int x0, int x1): returns P(x0 < X <= x1) [see also 1) above]
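
The definitions proposed in the issue description can be tried out on a toy integer distribution. The following is only a sketch under stated assumptions: {{IntegerQuantileSketch}} is a hypothetical class (not part of Commons Math), its support is {0, ..., n-1}, and p is restricted to (0, 1] so that the p=0 question from note 3) does not arise. The probabilities in the usage note are dyadic so that the sums are exact in floating point.

```java
/** Hypothetical toy integer distribution on {0, ..., n-1}; not a Commons Math class. */
final class IntegerQuantileSketch {
    private final double[] pmf; // pmf[k] = P(X = k)

    IntegerQuantileSketch(double... pmf) {
        this.pmf = pmf.clone();
    }

    /** Returns P(X <= x). */
    double cumulativeProbability(int x) {
        double sum = 0.0;
        int last = Math.min(x, pmf.length - 1);
        for (int k = 0; k <= last; k++) {
            sum += pmf[k];
        }
        return sum;
    }

    /** Returns P(x0 < X <= x1), computable from the cdf alone. */
    double cumulativeProbability(int x0, int x1) {
        if (x0 > x1) {
            throw new IllegalArgumentException("x0 must not exceed x1");
        }
        return cumulativeProbability(x1) - cumulativeProbability(x0);
    }

    /** Returns inf{x in Z | P(X <= x) >= p} for p in (0, 1]. */
    int inverseCumulativeProbability(double p) {
        if (!(p > 0.0 && p <= 1.0)) {
            throw new IllegalArgumentException("p must be in (0, 1]: " + p);
        }
        for (int x = 0; x < pmf.length; x++) {
            if (cumulativeProbability(x) >= p) {
                return x;
            }
        }
        return pmf.length - 1; // guards against rounding in the total mass
    }
}
```

For {{new IntegerQuantileSketch(0.125, 0.25, 0.125, 0.5)}}, the cdf values are 0.125, 0.375, 0.5 and 1.0, so the quantile at p = 0.5 is 2 and the quantile at p = 0.51 is 3. The half-open interval probability P(0 < X <= 2) is 0.5 - 0.125 = 0.375, whereas the closed variant from note 1) would additionally need P(X = 0).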