commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Phil Steitz <phil.ste...@gmail.com>
Subject Re: [MATH] Test failure in Continuum
Date Mon, 06 Aug 2012 15:07:44 GMT




On Aug 6, 2012, at 6:08 AM, Gilles Sadowski <gilles@harfang.homelinux.org> wrote:

> On Mon, Aug 06, 2012 at 05:48:14AM -0700, Phil Steitz wrote:
>> 
>> 
>> 
>> 
>> On Aug 5, 2012, at 11:21 PM, Dennis Hendriks <D.Hendriks@tue.nl> wrote:
>> 
>>> See below.
>>> 
>>> On 08/06/2012 12:49 AM, Gilles Sadowski wrote:
>>>> On Sun, Aug 05, 2012 at 12:54:11PM -0700, Phil Steitz wrote:
>>>>> On 8/4/12 10:57 AM, Gilles Sadowski wrote:
>>>>>> Hello.
>>>>>> 
>>>>>> Referring to this failed test (cf. messages from Continuum):
>>>>>> ---CUT---
>>>>>> org.apache.commons.math3.exception.NumberIsTooLargeException: lower
bound (65) must be strictly less than upper bound (65)
>>>>>>   at org.apache.commons.math3.distribution.UniformIntegerDistribution.<init>(UniformIntegerDistribution.java:73)
>>>>>>   at org.apache.commons.math3.distribution.UniformIntegerDistribution.<init>(UniformIntegerDistribution.java:53)
>>>>>>   at org.apache.commons.math3.stat.descriptive.AggregateSummaryStatisticsTest.generatePartition(AggregateSummaryStatisticsTest.java:275)
>>>>>>   at org.apache.commons.math3.stat.descriptive.AggregateSummaryStatisticsTest.testAggregationConsistency(AggregateSummaryStatisticsTest.java:89)
>>>>>> 
>>>>>> 
>>>>>> It is due to a precondition check while creating the
>>>>>> "UniformIntegerDistribution" instance:
>>>>>> ---CUT---
>>>>>> if (lower>= upper) {
>>>>>>    throw new NumberIsTooLargeException(
>>>>>>                       LocalizedFormats.LOWER_BOUND_NOT_BELOW_UPPER_BOUND,
>>>>>>                       lower, upper, false);
>>>>>> }
>>>>>> ---CUT---
>>>>>> 
>>>>>> The test referred to above was using this code (before I changed
it use a
>>>>>> "UniformIntegerDistribution" instance):
>>>>>> ---CUT---
>>>>>> final int next = (i == 4 || cur == length - 1) ? length - 1 : randomData.nextInt(cur,
length - 1);
>>>>>> ---CUT---
>>>>>> 
>>>>>> It is now (after the change):
>>>>>> ---CUT---
>>>>>> final IntegerDistribution partitionPoint = new UniformIntegerDistribution(cur,
length - 1);
>>>>>> final int next = (i == 4 || cur == length - 1) ? length - 1 : partitionPoint.sample();
>>>>>> ---CUT---
>>>>>> 
>>>>>> Thus, AFAIK, the failure did not appear before because there was
no
>>>>>> precondition enforcement in "nextInt".
>>>>>> 
>>>>>> The question is: Was the code in the test correct (in allowing the
same
>>>>>> value for both bounds?
>>>>>> * In the negative, how to change it?
>>>>>> * The affirmative would mean that the precondition check in
>>>>>>   "UniformIntegerDistribution" should be relaxed to allow equal bounds.
>>>>>>   Does this make sense?
>>>>>>   If so, can we change it now, or is it forbidden in order to stay
>>>>>>   backwards compatible?
>>>>> 
>>>>> Your analysis above is correct.  The failure after the change is due
>>>>> to the fact that post-change the distribution is instantiated before
>>>>> the bounds check.  I changed the test to fix this.
>>>> 
>>>> Thanks.
>>>> 
>>>>> Both the
>>>>> randomData nextInt and the UniformIntegerDistribution constructor
>>>>> now forbid the degenerate case where there is only one point in the
>>>>> domain.  In retrospect, I guess it would have probably been better
>>>>> to allow this degenerate case.  Unfortunately, this would be an
>>>>> incompatible change, so will have to wait until 4.0 if we want to do
it.
>>>>> 
>>>>> The original code above illustrates the convenience of being able to
>>>>> just make direct calls to randomData.nextXxx, which is why this
>>>>> class exists ;)
>>>> 
>>>> As I wrote in another post, I'm not against the convenience methods. But
>>>> IMO, they should be located in a new "DistributionUtils" class.
>>>> And we should also find a way to remove the code duplication (in the
>>>> distribution's "sample()" method and in the corresponding "next..." method).
>>>> 
>>> 
>>> The RandomData class (or whatever it would be called) does indeed seem useful.
If we plan to keep it, we should probably make sure that there is a sample/next/... method
in that class for EVERY distribution, as some of them are missing, if I remember correctly.
Perhaps this is a separate issue though?
>>> 
>> 
>> All have the method now, but the impls delegate to RandomDataImpl.
> 
> They do not.

The method is in the interface.  There is a default, inversion-based implementation in the
abstract base class.  Not all distributions override the default impl.  The ones that do delegate
to the specialized methods in RandomDataImpl.

> 
>> In some cases, there is nothing better implemented than just inversion, provided
by the default inversion sampler.  That is OK.  What we need to do is just move the implementations
of the default and specialized samplers to the actual distribution classes.  These can't be
static, as they use the RamdomData instance.  I will take care of this.
> 
> Thanks for reading fully this message
>  http://markmail.org/message/5fpmwyiiw2xq4o3q

I don't get the desire to make sample() static.  It uses the parameters of the distribution
as well as the random generator.  Generally, repeated sampling calls will want to use the
same generator, so it is more convenient (both here and in RandomData) to use a RandomGenerator
provided at construction time.



> Gill
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Mime
View raw message