commons-dev mailing list archives

From F Norin <>
Subject Re: [math] Questions regarding probability distributions
Date Thu, 14 Oct 2004 15:31:22 GMT

> > Well, the problem is this: If I need to create some custom discrete
> > distribution that doesn't take on integer values, what interface should I
> > implement? With your model I have no choice but use the
> > ContinuousDistribution interface even though the distribution *isn't*
> > continuous. Does that make sense?
> Can you provide a practical example of this?  IIUC, what you are really
> arguing for is changing the int's in the DiscreteDistribution interface to
> doubles. This has the advantage of greater generality but makes it
> slightly less convenient for implementors of the most common discrete
> distributions, where the values are integers.

Well, changing the ints in the DiscreteDistribution interface to doubles is
kind of a workaround, but I don't think it will settle the issue for good;
see below.

As for examples, you can take *any* mixed distribution as an example of what I
mean. Consider a random variable X with domain D that can be partitioned
into subsets A and B such that

1. A is a countable set and 0 < P(X is in A) < 1
2. P(X = x) = 0 for all x in B

How would the distribution for such a random variable be represented in
your framework?

As a simple example of this, consider a random variable with a point mass
at 0 and a density on the interval (1, 2):

P(X = 0) = 0.5
f(x) = 0.5 for 1<x<2

How does this distribution fit into your framework? Sure, you could have
it implement the ContinuousDistribution interface but it *isn't* a
continuous distribution (in the sense that it doesn't conform to the
definition of a continuous distribution in probability theory) - and
then it shouldn't implement an interface called ContinuousDistribution.
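
The distribution function of this example can still be written down directly, even though the distribution is neither discrete nor continuous. A minimal sketch, assuming half of the probability sits at the point 0 and the other half is spread uniformly over (1, 2); the class name MixedExample is hypothetical and not part of commons-math:

```java
// Hypothetical sketch; MixedExample is an illustrative name only.
public final class MixedExample {

    /** Returns P(X <= x) for the mixed distribution from the example. */
    public static double distributionFunction(double x) {
        if (x < 0.0) {
            return 0.0;                   // no probability below the atom at 0
        }
        if (x <= 1.0) {
            return 0.5;                   // only the atom at 0 has been passed
        }
        if (x < 2.0) {
            return 0.5 + 0.5 * (x - 1.0); // atom plus density 0.5 on (1, 2)
        }
        return 1.0;
    }

    public static void main(String[] args) {
        System.out.println(distributionFunction(-1.0)); // 0.0
        System.out.println(distributionFunction(0.0));  // 0.5
        System.out.println(distributionFunction(1.5));  // 0.75
        System.out.println(distributionFunction(3.0));  // 1.0
    }
}
```

The function jumps at 0 and rises linearly on (1, 2), so it has neither a pure probability mass function nor a pure density.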

Recall: A random variable is continuous if its distribution function P(X <= x)
can be expressed as the Riemann integral, from -infinity to x, of some
integrable function f: R -> [0, infinity).

The basic problem is that you have an implicit assumption in your
framework that each and every probability distribution can be classified
as being either discrete or continuous. That is simply not true.
Discrete and continuous distributions are really only special cases of
a broader concept. Aside from that you also have the problem of how to
handle the case of a discrete distribution that doesn't take on integer
values.

Note: There are also distributions that are neither discrete, continuous, nor
a mixture of the two. For example, there are numerous distributions based
upon the Cantor ternary set.

The bottom line is that you *cannot* do without a generic
ProbabilityDistribution interface.
This interface should expose a method that exists for every probability
distribution and completely determines it, such as the distribution
function P(X <= x).

As an easy solution, you could define it as

public interface ProbabilityDistribution {
        public double distributionFunction(double x);
}

and have ContinuousDistribution and DiscreteDistribution extend it.
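
To illustrate what the common supertype buys you, here is a rough sketch of such a hierarchy. Everything except the names ProbabilityDistribution, ContinuousDistribution, DiscreteDistribution and distributionFunction is an assumption made for illustration: the density and probability methods, the double-valued discrete interface, and the two sample classes are hypothetical and do not reflect the actual commons-math interfaces.

```java
// Hypothetical sketch; not the actual commons-math API.
public final class DistributionSketch {

    interface ProbabilityDistribution {
        double distributionFunction(double x); // P(X <= x)
    }

    interface ContinuousDistribution extends ProbabilityDistribution {
        double density(double x);              // assumed method name
    }

    interface DiscreteDistribution extends ProbabilityDistribution {
        double probability(double x);          // assumed: P(X = x), x a double
    }

    /** Uniform distribution on (0, 1). */
    static final class Uniform01 implements ContinuousDistribution {
        public double density(double x) {
            return (x > 0.0 && x < 1.0) ? 1.0 : 0.0;
        }
        public double distributionFunction(double x) {
            return Math.max(0.0, Math.min(1.0, x));
        }
    }

    /** Discrete distribution on the non-integer values 0.5 and 1.5. */
    static final class TwoPoint implements DiscreteDistribution {
        public double probability(double x) {
            return (x == 0.5 || x == 1.5) ? 0.5 : 0.0;
        }
        public double distributionFunction(double x) {
            if (x < 0.5) {
                return 0.0;
            }
            return (x < 1.5) ? 0.5 : 1.0;
        }
    }

    /** P(a < X <= b), computed for any distribution via the supertype. */
    static double intervalProbability(ProbabilityDistribution d, double a, double b) {
        return d.distributionFunction(b) - d.distributionFunction(a);
    }

    public static void main(String[] args) {
        System.out.println(intervalProbability(new Uniform01(), 0.25, 0.75)); // 0.5
        System.out.println(intervalProbability(new TwoPoint(), 0.0, 1.0));    // 0.5
    }
}
```

Code such as intervalProbability only needs the distribution function, so a mixed distribution could implement ProbabilityDistribution directly without pretending to be continuous.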

This should work ok (though the name DiscreteDistribution is misleading)
but if you want a completely generic and typesafe definition you should
go for something like

public interface ProbabilityDistribution {
        public Probability distributionFunction(Number x);
}

where Number is the standard java.lang.Number and Probability is a new
class that would need to be defined. Since probability measure is the
fundamental concept in probability theory, having a Probability class is
probably a good idea anyway. It could look something like this

import java.io.Serializable;

public final class Probability implements Serializable {

    public static final Probability PROBABILITY_ONE = new Probability(1.0);
    public static final Probability PROBABILITY_ZERO = new Probability(0.0);

    private final double value;

    public Probability(double v) throws IllegalArgumentException {
        // Written as a negated range check so that NaN is rejected too.
        if (!(v >= 0 && v <= 1)) {
            throw new IllegalArgumentException("Illegal probability: " + v);
        }
        this.value = v;
    }

    public double value() {
        return value;
    }
}

(A custom equals method should also be provided of course.)
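
One way the equals method could look is sketched below, together with the matching hashCode that the Object contract requires. This is only a sketch under assumptions: the choice to compare with Double.compare, the normalization of -0.0 to 0.0, and the serialVersionUID value are mine, not something settled in this thread. Double.hashCode(double) needs Java 8 or later.

```java
import java.io.Serializable;

// Sketch of a value-based Probability; details are assumptions, see above.
public final class Probability implements Serializable {

    private static final long serialVersionUID = 1L; // assumed value

    private final double value;

    public Probability(double v) {
        // Negated range check so that NaN is rejected as well.
        if (!(v >= 0.0 && v <= 1.0)) {
            throw new IllegalArgumentException("Illegal probability: " + v);
        }
        // Map -0.0 to 0.0 so both zeros compare equal in equals().
        this.value = (v == 0.0) ? 0.0 : v;
    }

    public double value() {
        return value;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof Probability)) {
            return false;
        }
        return Double.compare(value, ((Probability) other).value) == 0;
    }

    @Override
    public int hashCode() {
        return Double.hashCode(value); // consistent with equals()
    }
}
```

With this in place, two Probability objects holding the same value are interchangeable as map keys or set members.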
