commons-dev mailing list archives

From Phil Steitz <phil.ste...@gmail.com>
Subject Re: [math] Generate random data using the Inverse CDF Method?
Date Tue, 03 Nov 2009 12:54:17 GMT
Phil Steitz wrote:
> Mikkel Meyer Andersen wrote:
>> 2009/11/3 Luc Maisonobe <Luc.Maisonobe@free.fr>:
>>> There is at least one other regular committer and three other committers
>>> that have been active on the list last year. Phil is clearly one of the
>>> most involved maintainers and he has been here since the beginning.
>> Okay, thanks for the info. I know how much Phil means and I haven't
>> for a second doubted that.
> 
> One important thing to understand about how things work here is that
> there is no hierarchy among committers and in terms of ideas,
> patches, itches-to-scratch, etc. all - including non-committers - are
> on equal footing.  Just because I have been around for a while does
> not mean my ideas are any better than yours or anyone else's.
> 
>>> There are only two lists: the users list and the developers list (here).
>>> Both lists are archived and searchable.
>>>
>>> I have no preference on this specific topic, sorry. One important thing
>>> to me is also to keep backward compatibility (as strange as it might
>>> seem after the bunch of changes I introduced last summer).
>> I agree with this, at least to the degree where it is practically feasible.
>>> Would the change imply that the random package would disappear? In this
>>> case I would be against it. Would that change imply that low level "raw"
>>> generators would be in random and higher level generators in
>>> distribution? In this case, I don't know what is better.
>>>
>>> One thing I would like to add at some time in the future would be better
>>> and more modern "raw" generators in the same spirit as the Mersenne
>>> Twister (typically I would like to add the WELL family of generators).
>>>
>>> From a user point of view, it is also important to be able to select a
>>> different raw generator underlying a high level one. This is used for
>>> example in Monte-Carlo analyses when one wants to reproduce a subset of
>>> an already generated sequence, or according to what has higher priority,
>>> generation speed or generation accuracy with respect to the desired
>>> distribution.
> 
> This is why I would like to keep the random data generation
> machinery in the random package.  As I stated elsewhere, I am +0/1
> on the idea of adding generic inversion-based generators that work
> with any invertible distribution; but I still do not see attaching
> them to the distribution implementations as a good idea.  This is
> for three reasons: 0) I see it as poor separation of concerns
> (admittedly this is a matter of taste, but I do not see sourcing
> random deviates as an essential behavior of a probability
> distribution)

A little more explanation of the separation of concerns issue.
Inference is another thing that one frequently does *with*
distributions.  This was in fact the application that led to
introduction of the first distributions in commons-math.  But would
we add hypothesis testing to the distributions themselves? Obviously
not.  It is interesting to ask, for each distribution, how often you
would need either to generate random data from it or to perform
hypothesis tests using it.  In addition to the obvious question of
separation of concerns, the variability in the answer to this is
another indication that neither of these is an essential behavior of
the class.
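
To make the separation concrete, here is a minimal sketch of a generic
inversion-based generator that lives outside the distribution classes.
It is only an illustration, not commons-math code: the class name
InversionSampler and the helper exponentialInverseCdf are hypothetical,
java.util.Random stands in for whatever "raw" generator is plugged in,
and the functional interface assumes Java 8 or later.

```java
import java.util.Random;
import java.util.function.DoubleUnaryOperator;

/** Hypothetical helper (not part of commons-math): draws deviates by inverting a CDF. */
final class InversionSampler {
    private final DoubleUnaryOperator inverseCdf; // maps p in (0, 1) to the p-quantile
    private final Random source;                  // pluggable "raw" uniform generator

    InversionSampler(DoubleUnaryOperator inverseCdf, Random source) {
        this.inverseCdf = inverseCdf;
        this.source = source;
    }

    /** Returns one deviate x = F^{-1}(u), with u uniform on the open interval (0, 1). */
    double next() {
        double u;
        do {
            u = source.nextDouble(); // uniform on [0, 1)
        } while (u == 0.0);          // reject 0 so u stays strictly inside (0, 1)
        return inverseCdf.applyAsDouble(u);
    }

    /** Inverse CDF of an exponential distribution with the given mean (illustrative only). */
    static DoubleUnaryOperator exponentialInverseCdf(double mean) {
        return p -> -mean * Math.log(1.0 - p);
    }

    public static void main(String[] args) {
        // The seed makes the sequence reproducible; swapping the Random
        // instance changes the raw generator without touching the distribution.
        InversionSampler sampler =
            new InversionSampler(exponentialInverseCdf(2.0), new Random(42L));
        int n = 100_000;
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += sampler.next();
        }
        System.out.println("sample mean = " + (sum / n));
    }
}
```

In this sketch the distribution contributes only its inverse CDF, and
the generator owns the choice of raw source, which is the division of
responsibilities argued for above.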

Phil

> 1) if the implementation is *only* inversion-based, it
> will be naive for some distributions and we do not want users to get
> a bad impl by default 2) to fix 1) we have to essentially refactor
> our package structure to place random data generation into the
> distributions package, causing users to have to instantiate
> distributions and also configure generators to get deviates.  I see
> it as simpler and more natural to use a RandomData instance.  I am
> -1 on dropping the random package for the reasons that Luc states.
> Therefore, I am not in favor of attaching this functionality to the
> distributions.
> 
> Phil
> 
> 
>>> Luc
>> Cheers, Mikkel.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> For additional commands, e-mail: dev-help@commons.apache.org
>>
> 



