commons-dev mailing list archives

From Mikkel Meyer Andersen
Subject Re: [math] EmpiricalDistribution
Date Tue, 06 Sep 2011 15:58:05 GMT
2011/9/6 Phil Steitz:
> On 9/6/11 12:00 AM, Mikkel Meyer Andersen wrote:
>> 2011/9/5 Phil Steitz:
>>> I have a couple of proposals for this class:
>>> 0) Merge the interface and impl.   This is consistent with what we
>>> are doing in some other places where we have only one implementation.
>> Fine with me.
>>> 1) Extend this class to actually provide a distribution - i.e.
>>> implement the Distribution interface.
>> Won't we have problems, e.g. with implementing cumulativeProbability?
> The idea I had was to interpolate within bins.  So to compute the
> cdf at x you would find its bin, sum the mass (based on number of
> original sample points contained, like the sampling does) of the
> bins below its containing bin and then use the defined kernel within
> bin to determine how much of its own bin's mass to include.
Seems reasonable. But we might want to allow a user-specified support,
something simple like the endpoints of an interval. Otherwise the
lowest and highest sample values define the support, which might not
be a good idea.
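
A minimal sketch of the bin-interpolation idea with an optional
user-specified support, assuming equal-width bins and a uniform kernel
within each bin. The names BinnedCdf and fromSampleRange are
hypothetical and not part of Commons Math:

```java
import java.util.Arrays;

/** Hypothetical sketch; not the Commons Math EmpiricalDistribution API. */
public class BinnedCdf {
    private final double lower;     // lower endpoint of the support
    private final double binWidth;  // common width of all bins
    private final long[] counts;    // number of sample points per bin
    private final long total;       // total number of sample points

    /** Bins span the user-specified support [lower, upper]. */
    public BinnedCdf(double[] sample, int binCount, double lower, double upper) {
        if (sample.length == 0 || binCount < 1 || !(upper > lower)) {
            throw new IllegalArgumentException("need data, bins and lower < upper");
        }
        this.lower = lower;
        this.binWidth = (upper - lower) / binCount;
        this.counts = new long[binCount];
        this.total = sample.length;
        for (double v : sample) {
            if (v < lower || v > upper) {
                throw new IllegalArgumentException("value outside support: " + v);
            }
            counts[binIndex(v)]++;
        }
    }

    /** Fallback: the support is [min, max] of the sample (fails if min == max). */
    public static BinnedCdf fromSampleRange(double[] sample, int binCount) {
        double min = Arrays.stream(sample).min().orElse(Double.NaN);
        double max = Arrays.stream(sample).max().orElse(Double.NaN);
        return new BinnedCdf(sample, binCount, min, max);
    }

    private int binIndex(double x) {
        int i = (int) Math.floor((x - lower) / binWidth);
        return Math.min(Math.max(i, 0), counts.length - 1); // upper endpoint goes in last bin
    }

    public double cumulativeProbability(double x) {
        if (x <= lower) {
            return 0.0;
        }
        if (x >= lower + binWidth * counts.length) {
            return 1.0;
        }
        int bin = binIndex(x);
        long below = 0;                       // mass of the bins below the containing bin
        for (int i = 0; i < bin; i++) {
            below += counts[i];
        }
        // Uniform kernel: include the fraction of the bin lying at or below x.
        double fraction = (x - (lower + bin * binWidth)) / binWidth;
        return (below + fraction * counts[bin]) / total;
    }
}
```

With fromSampleRange the cdf is exactly 0 at the smallest observation
and 1 at the largest, which is the behaviour a user-specified support
would let callers avoid.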
>>> 2) make the kernel used within bins configurable.  Currently, values
>>> are generated (and the cdf would be computed) assuming a Gaussian
>>> distribution within bins.  I think at least a uniform option should
>>> be provided.
>> +1, maybe it can be generalised to providing user-defined kernels.
> Good idea.  Need to think about how to enable that.
> Thanks!
> Phil
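
One possible way to make the within-bin kernel pluggable, as a sketch
only: represent the kernel as a function that maps the relative
position t in [0, 1] inside a bin to the fraction of that bin's mass at
or below t. The class KernelBinCdf and the constants UNIFORM and
TRIANGULAR are hypothetical names, and the triangular kernel is just an
illustration of a user-defined choice (a Gaussian option is left out
here):

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical sketch of a configurable within-bin kernel. */
public class KernelBinCdf {
    /** Uniform kernel: mass spread evenly across the bin. */
    public static final DoubleUnaryOperator UNIFORM = t -> t;

    /** Triangular kernel peaked at the bin centre (example of a user-defined kernel). */
    public static final DoubleUnaryOperator TRIANGULAR =
        t -> t <= 0.5 ? 2 * t * t : 1 - 2 * (1 - t) * (1 - t);

    private final double[] edges;   // bin edges, strictly increasing, length = bins + 1
    private final long[] counts;    // number of sample points per bin
    private final long total;       // total number of sample points
    private final DoubleUnaryOperator kernel;

    public KernelBinCdf(double[] edges, long[] counts, DoubleUnaryOperator kernel) {
        if (edges.length != counts.length + 1) {
            throw new IllegalArgumentException("need one more edge than bins");
        }
        long sum = 0;
        for (long c : counts) {
            sum += c;
        }
        if (sum <= 0) {
            throw new IllegalArgumentException("bins contain no data");
        }
        this.edges = edges.clone();
        this.counts = counts.clone();
        this.total = sum;
        this.kernel = kernel;
    }

    public double cumulativeProbability(double x) {
        if (x <= edges[0]) {
            return 0.0;
        }
        if (x >= edges[edges.length - 1]) {
            return 1.0;
        }
        long below = 0;
        int bin = 0;
        // Accumulate the mass of every bin that lies entirely at or below x.
        while (x >= edges[bin + 1]) {
            below += counts[bin];
            bin++;
        }
        // The kernel decides how much of the containing bin's mass to include.
        double t = (x - edges[bin]) / (edges[bin + 1] - edges[bin]);
        return (below + kernel.applyAsDouble(t) * counts[bin]) / total;
    }
}
```

Any kernel supplied this way should be nondecreasing with value 0 at
t = 0 and 1 at t = 1, otherwise the result is not a valid cdf.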
>>> Thanks in advance for any feedback on this or further suggestions
>>> for improvement.
>>> Phil
Cheers, Mikkel.
