commons-dev mailing list archives

Subject Re: [math] Questions regarding probability distributions
Date Wed, 13 Oct 2004 08:55:30 GMT

> Some may disagree with that definition (really amounts to a convention on
> how to handle vacuous products); but I think that I agree with you.
> Unless others object, I will consider this a bug and make the change.
> Thanks for pointing this oddity out. Would you mind opening a bugzilla
> ticket to track this?

Fixed that.

> > There is no requirement that
> > a discrete distribution only takes on integer values so the methods of
> > interface DiscreteDistribution don't cover all discrete distributions.
> Technically, you are correct - a discrete random variable can take on any
> (countable) set of values. The key difference, however, is that for a
> discrete random variable X, if x is one of the values that it takes with
> positive probability, p(X=x) is non-zero. Therefore, it makes sense to
> have probability(x) defined (as it is) in the DiscreteDistribution
> interface, but absent from the ContinuousDistribution interface.  More
> care also needs to be taken in the discrete case to define and implement
> p(a < X < b) type quantities including or excluding endpoints
> appropriately (since it makes a difference whether or not endpoints are
> included).
> In fairness, the DiscreteDistribution interface only supports
> integer-valued distributions. The only ones that I have ever used are
> integer-valued, and any odd countable set can be mapped into the integers,
> so I don't personally view this as a serious limitation.
> > On the other hand, all of the methods of ContinuousDistribution hold
> > equally well for a discrete probability distribution.
> With slightly different semantics, as noted above.
> > In my opinion, a more
> > appropriate approach would be to rename ContinuousDistribution to
> > ProbabilityDistribution and drop the DiscreteDistribution interface. Of
> > course, it could be practical to have convenience methods that take
> > integer arguments for the probability densities for certain distributions,
> > but then you can define a new interface IntegerValuedDistribution like
> >
> > public interface IntegerValuedDistribution extends ProbabilityDistribution {
> >     double probability(int x);
> >     double cumulativeProbability(int x) throws MathException;
> > }
> I see your point here and it is legitimate.  I am -0 to making the change,
> however, since the interface contract (for the base interface) would in
> practical terms be different for the discrete case, so I think it is
> better to keep them separate.

Well, the problem is this: if I need to create some custom discrete
distribution that doesn't take on integer values, what interface should I
implement? With your model I have no choice but to use the
ContinuousDistribution interface even though the distribution *isn't*
continuous. Does that make sense?

You're right that the domain of every discrete distribution could be mapped 
into the integers, but then there should be some mechanism for internalizing 
this mapping for a particular distribution. Every discrete distribution 
should have a reference to a Domain object that provides a method for mapping 
every object of the set into the integers. Using such a scheme you could keep 
separate interfaces for discrete and continuous distributions. (Actually, the
continuous distributions could have a Domain object too.)
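
A minimal sketch of the Domain idea, to make the proposal concrete. All of
the names here (Domain, ListDomain, FiniteDiscreteDistribution and their
methods) are hypothetical and are not part of the existing package. It also
assumes a finite set of values and a compiler with generics support:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical: maps the values a distribution can take onto integer indices.
interface Domain<T> {
    int indexOf(T value);   // returns -1 if the value is not in the domain
    T valueAt(int index);
    int size();
}

// Hypothetical finite domain backed by a list of distinct values.
final class ListDomain<T> implements Domain<T> {
    private final List<T> values;

    ListDomain(List<T> values) { this.values = values; }

    public int indexOf(T value) { return values.indexOf(value); }
    public T valueAt(int index) { return values.get(index); }
    public int size() { return values.size(); }
}

// Discrete distribution over an arbitrary finite set: integer-indexed
// internally, with the Domain doing the mapping at the boundary.
final class FiniteDiscreteDistribution<T> {
    private final Domain<T> domain;
    private final double[] masses;

    FiniteDiscreteDistribution(Domain<T> domain, double[] masses) {
        if (masses.length != domain.size()) {
            throw new IllegalArgumentException("one mass per domain value required");
        }
        this.domain = domain;
        this.masses = masses.clone();
    }

    Domain<T> getDomain() { return domain; }

    // Integer-valued view: probability of the value with the given index.
    double probabilityAt(int index) {
        return (index >= 0 && index < masses.length) ? masses[index] : 0.0;
    }

    // Same quantity, addressed by the actual (possibly non-integer) value.
    double probability(T value) {
        return probabilityAt(domain.indexOf(value));
    }

    // Sum of the masses for indices 0..index, i.e. cumulative in index order.
    double cumulativeProbabilityAt(int index) {
        double sum = 0.0;
        for (int i = 0; i <= index && i < masses.length; i++) {
            sum += masses[i];
        }
        return sum;
    }
}
```

With a domain of {0.5, 1.5, 2.5} and masses {0.25, 0.5, 0.25}, a caller can
ask for probability(1.5) directly, while the integer-indexed methods keep the
same shape as the current DiscreteDistribution methods.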

> > 4.  Since the chi-squared and exponential distributions are just special
> > cases of the gamma distribution, there is no need to have separate
> > implementation classes for these. In my opinion, one should avoid having
> > multiple implementations of the same distribution unless there is some
> > strong reason for it.
> Well, the chi-square implementation wraps a gamma instance.  The
> exponential implementation computes the exponential density explicitly.
> In any case, the distributions are different (though related) and
> sufficiently useful in their own rights that exposing them separately will
> be easier for users.

See my answer to another post.

> > 5. There are quite a lot of elementary distributions missing.
> > I wrote an implementation of the Poisson distribution while testing the
> > package and have attached the files for it.
> Yes!! That is why the framework was designed to support adding
> distributions. Could you open a bugzilla ticket and attach the files
> there?  I will review and get these in ASAP.

Fixed that.
