commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gary Gregory <garydgreg...@gmail.com>
Subject Re: [RNG] About RNG-6 (Was: [VOTE][RC1] Release Commons Rng 1.0)
Date Thu, 15 Sep 2016 18:45:57 GMT
After scanning this, I am wondering if a smoother path to 1.0 would be
really be through a beta or alpha release. The code is all there, any one
can look at it.

The advantage of a beta release is that it gets pushed out as a ANNOUNCE
email and might get more notice. YMMV.

Gary

On Thu, Sep 15, 2016 at 7:37 AM, Gilles <gilles@harfang.homelinux.org>
wrote:

> Hi.
>
> Moving to another thread, as your comments do not belong in
> a release vote.
>
> Many of them could lead to interesting discussions that should
> (and could) have happened earlier.
>
> Indeed, for the record, see
>   https://issues.apache.org/jira/browse/RNG-6
> and the numerous posts which I sent to this list
>  * while ironing out the code of this new component in its own
>    repository, and doing the various chores (site, userguide,
>    quality testing, Github, Travis, Coveralls, etc) for the last
>    6 weeks,
>  * while developing that code in a new Commons Math package
>    "o.a.c.math4.rng" for the last 6 months,
>  * while fixing the code in Commons Math "o.a.c.m.random"
>    package for the last 9 months.
>
> For a number of issues which you now raise, I've explicitly
> asked for input.
>
> I find it a bit strange that, at this precise point, you start
> to question the choices which I've made in the absence of any
> prior remarks, suggestions, or warnings.
>
> On Wed, 14 Sep 2016 17:04:53 +0200, Emmanuel Bourg wrote:
>
>> Hi all,
>>
>> RNG is moving fast, that's nice. I got a quick look at the API and the
>> site, here are my comments:
>>
>> - The main page (Overview) could be enhanced with a description of the
>> API longer than a single sentence. It could mention the motivations for
>> the API, the implementations supported, a sample code snippet and a link
>> to the user guide.
>>
>
> I don't understand "motivations for the API".
>
> Do you mean something like my comment in
>   https://issues.apache.org/jira/browse/RNG-6
> ?
>
> [My motivation for proposing this code can be gathered from
> the archives and the bugs and shortcomings reported in the
> JIRA MATH project, and related posts here). They do not
> belong in that "overview" page.]
>
> - In the Performance section of the User Guide, I'd suggest grouping the
>> 3 tables into a single one. It might also be interesting to put the
>> quality results in the same table, so we can see the trade-off between
>> performance and quality.
>>
>
> I won't do that.
> It was already a pain to create those tables as they are (see the
> doc source).
>
> - The License section of the user guide could be removed, I guess
>> everyone knows that Apache projects use the Apache license.
>>
>
> No problem.
> I thought that it was mandatory.
>
> All the pages were copied from the corresponding Commons Math ones.
>
> - The internal packages should be hidden from the Javadoc,
>>
>
> I do not know how do that.
>
> And I don't see the point, since from the "binary" POV, they
> are "public".
>
> otherwise
>> users may be tempted to directly use the classes there.
>>
>
> You must have seen that there are prominent warnings against
> doing that.
>
> Users will do what they want anyways.
> What is not supported is not supported.  We agreed here that it
> was enough to tag a class as "internal" in order to fulfill our
> obligation of least surprise.
>
> This would imply
>> copying the documentation of the generators into the RandomSource class.
>>
>
> I do not agree.
>
> Users can select a generator without knowing the gory details.
> If they want to know more, they go to the internals.
> It they really want to know, they go to the references.
>
> Internal should not be hidden: contributors who'd fancy to provide
> additional generators do not have to go through any hoops.
>
> Or do you suggest that this component should supply two JARs:
>   * commons-rng-api
>      Contents of "o.a.c.rng"
>   * commons-rng-core
>      Contents of "o.a.c.rng.internal"
> ?
>
> And two sets of Javadoc?
>
> - Why isn't the UniformRandomProvider interface simply named
>> RandomProvider?
>>
>
> Maybe because we want to be transparent on what the code does (?).
>
> With this name, there is no ambiguity: no more, no less than what
> the name implies.
>
> Is there any plan to implement non uniform random
>> generators in the future? Will they have a different interface?
>>
>
> This point has been discussed on this list (although, again, there
> was a severe lack of feedback...).
>
> Executive summary:
>   No strong coupling between generators (of uniform sequence of
>   pseudo-random numbers) and utilities on top of those generators
>   (e.g. generators producing non-uniform sequences).
>   Utilities should go in another component/package/module (see
>   e.g. Commons Math's packages "o.a.c.math4.distribution" and
>   "o.a.c.math4.random" for candidate code).
>
> - UniformRandomProvider mimics the java.util.Random methods which is
>> nice. Is there any value in also implementing the nextGaussian() method
>> such that our API is a perfect drop-in replacement for the JDK class?
>>
>
> No. [IMHO.]
> This was discussed on this list (and cf. above point).
>
> - For code relying on the java.util.Random type, it might be useful to
>> provide an adapter turning an UniformRandomProvider into a
>> java.util.Random implementation (Hipparchus has one).
>>
>
> This was discussed on this list.
>
> And see Commons Math's classes "o.a.c.math4.random.RandomUtils" and
> "o.a.c.math4.random.RngAdaptor" classes.
>
> In the latter (being deprecated as of its creation), I indicated why
> it is a bad idea.
>
> Again, if we really want to support the use-case you proposed, it's
> a utility that should go in another component.
>
> - The choice of an enum for the factory class RandomSource is a bit
>> unusual. It is probably ok for simple RNG with a simple seed, but with
>> more complex implementations I find it less intuitive to use.
>>
>
> Less intuitive than what?
>
> For example the create(RandomSource, Object Object...) method has this
>> snippet in its documentation:
>>
>>   RandomSource.create(RandomSource.TWO_CMRES_SELECT, 26219, 6, 9)
>>
>> When a user types this in its IDE, there is no type information nor
>> contextual documentation for the parameters expected. The user has to
>> know the type of the seed and the meaning of the parameters expected,
>> the IDE won't be able to help here (with CTRL+P and CTRL+Q in IntelliJ).
>>
>
> Honestly, I don't care.
> To put it less bluntly, the choice of RNG to instantiate is not a
> matter of IDE.
>
> Practically, the lack of type information (which I don't like either)
> was a trade-off to have a dead-simple API.
>
> As a user, one doesn't need an intelligent IDE to know all there is
> to know about Commons RNG, and it will take less than 10 minutes!
>
> This design induces some complexity, the loss of the seed type leads to
>> the seed conversion mechanism (most of the util package if I understand
>> well).
>>
>> I'd feel more comfortable with a more classic factory design, something
>> like:
>>
>>   RandomSource.createTwoCmres()
>>   RandomSource.createTwoCmres(Integer seed)
>>   RandomSource.createTwoCmres(Integer seed, int i, int j)
>>
>> or a shorter form like:
>>
>>   RNG.newTwoCmres()
>>   RNG.newTwoCmres(int seed)
>>   RNG.newTwoCmres(Integer seed, int i, int j)
>>
>> That will add more methods to the factory, but it'll be much more usable
>> in my opinion.
>>
>
> Not in mine.
>
> This design came after 9 months of sustained work.
> Nothing to be proud of in terms of LOC/day...
> But what you propose to drop is the only non trivial part of
> this work.
>
> As I said above, this is a trade-off, and it looks like a fault
> only because Java is strongly typed.
>
> The automatic seed conversion is a huge advantage for casual
> users; they can still seed them with a single "long" (and under
> the hood get a good "native" seed).
> But the doc is there to warn them that RNGs usually have a larger
> state.
>
> - I'm not convinced by the need for the convenience static methods
>> RandomSource.createInt/Long. If the user doesn't want to specify the
>> seed, he could simply use the seed-less create method. And if the random
>> generator requires extra parameters, he could use a null value for the
>> seed as a way to mean "generate one for me please":
>>
>>   RandomSource.create(RandomSource.TWO_CMRES_SELECT, null, 6, 9)
>>
>
> These methods take care of a common use-case, where the actual
> seed does not matter (hence the caller might like to not have to
> code similar methods), but the _same_ seed must be used in another
> execution of the same application to ensure reproducibility (e.g.
> for bug tracking).
>
> See also:
>   https://docs.oracle.com/javase/7/docs/api/java/security/
> SecureRandom.html#generateSeed%28int%29
>
> - RandomSource.saveState/restoreState allows one to convert a random
>> generator to/from a byte array. This looks a lot like Java
>> serialization. Maybe the implementations supporting this feature could
>> be marked as Serializable, thus it would be possible to call
>> RandomSource.saveState only if the instance is Serializable instead of
>> catching the UnsupportedOperationException thrown when the feature isn't
>> supported.
>>
>
> No, no, no.
> Avoiding the "Serializable" clutter was a _requirement_.
>
> Serialization is not part of this component, but is supported by all
> implementations.
>
> It's a feature that users can _choose_ the persistence framework
> they want and not be forced to consider the pitfalls of "Serializable".
> [See "Effective Java".]
>
> It's also a feature that this component will never have to deal
> with CVE related to exploitation of something "Serializable" that
> should not have been...
>
> - Why is the byte[] state encapsulated into the RandomSource.State
>> class? Why not using a byte array directly?
>>
>
> Because encapsulation is the hallmark of good OO programming?
>
> If some application only needs to save/restore without actual
> persistence, it can do without dealing with the "ugliness" of
> "byte[]".
> And it will continue to work even if the internal representation
> is changed.
>
> Exposing the internals was unavoidable for the serialization
> use-cases.
>
>
>> So far I'm +1 for a beta release aimed at getting more feedback on the
>> API,
>>
>
> There were a lot of discussions here about what "beta release"
> means and how to implement it in practice (package naming,
> versioning scheme).
>
> They never led anywhere.
>
> Snapshot JARs have been available since the Jenkins task has been
> up and running.
>
> I've asked for feedback on this list, on the site since it's been
> up, and on "helpwanted.apache.org".
>
> I seriously doubt that having code tagged "beta" would be more
> successful at getting users to provide feedback.
>
> It will more likely turn them away, thinking that the generators
> might be buggy.  Which is not the case.
>
> As much as I advocated it in order to have more timely releases
> of Commons Math, I think that in this case, a "beta" release
> (if such a thing existed in Commons) is nothing short of
> counter-productive.
>
> The API is what it is.  As for any other component; and, quite
> frankly, in proportion of the lines of code, much more thought
> was put into it than for anything in Commons Math, and probably
> many other Commons components).
>
> I do not deny that it might be wrong (or I wouldn't have filed
> "RNG-6") but at this point it is usage that should demonstrate
> it.  Not doubt based on... I don't know what.
>
> but I'm -1 for releasing the current code as 1.0.
>>
>
> Any technical reasons?
>
>
> Gilles
> [Truly sorry that this discussion did not happen at the proper
> time. And hoping that constructive feedback like this one will
> come when I start talking about "commons-rng-tools".]
>
>
>
>> Emmanuel Bourg
>>
>>
>> Le 11/09/2016 à 16:55, Gilles a écrit :
>>
>>> Hi.
>>>
>>> This is a [VOTE] for releasing Apache Commons Rng 1.0 (from RC1).
>>>
>>> Tag name:
>>>   RNG_1_0_RC1 (signature can be checked from git using 'git tag -v')
>>>
>>> Tag URL:
>>>
>>>
>>> https://git-wip-us.apache.org/repos/asf?p=commons-rng.git;a=
>>> commit;h=f8d23082607b9f2c7be7f489eb09627722440ee5
>>>
>>>
>>> Commit ID the tag points at:
>>>   f8d23082607b9f2c7be7f489eb09627722440ee5
>>>
>>> Site:
>>>   http://home.apache.org/~erans/commons-rng-1.0-RC1-site
>>>
>>> Distribution files:
>>>   https://dist.apache.org/repos/dist/dev/commons/rng/
>>>
>>> Distribution files hashes (SHA1):
>>>   a221e862c8ff970a9ca3e7fbd86c3200d1f8780a commons-rng-1.0-bin.tar.gz
>>>   689b2bfbdb1856d4f47851d75762aab42057805a commons-rng-1.0-bin.zip
>>>   40b7b1639eedf91b5fad5d38e6ebec01e659048f commons-rng-1.0-src.tar.gz
>>>   6296dbabde10169d6365bda99f2af6dcc191e515 commons-rng-1.0-src.zip
>>>
>>> KEYS file to check signatures:
>>>   http://www.apache.org/dist/commons/KEYS
>>>
>>> Maven artifacts:
>>>
>>>
>>> https://repository.apache.org/content/repositories/orgapache
>>> commons-1199/org/apache/commons/commons-rng/1.0/
>>>
>>>
>>> [ ] +1 Release it.
>>> [ ] +0 Go ahead; I don't care.
>>> [ ] -0 There are a few minor glitches: ...
>>> [ ] -1 No, do not release it because ...
>>>
>>> This vote will close in 72 hours, at 2016-09-14T15:10:00Z (this is UTC
>>> time).
>>> ----------
>>>
>>> Thanks,
>>> Gilles
>>>
>>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>


-- 
E-Mail: garydgregory@gmail.com | ggregory@apache.org
Java Persistence with Hibernate, Second Edition
<http://www.manning.com/bauer3/>
JUnit in Action, Second Edition <http://www.manning.com/tahchiev/>
Spring Batch in Action <http://www.manning.com/templier/>
Blog: http://garygregory.wordpress.com
Home: http://garygregory.com/
Tweet! http://twitter.com/GaryGregory

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message