commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Phil Steitz <phil.ste...@gmail.com>
Subject Re: [MATH] Should Frequency treat NaN specially?
Date Tue, 16 Jul 2013 14:24:19 GMT
On 7/16/13 2:07 AM, sebb wrote:
> On 16 July 2013 07:02, Phil Steitz <phil.steitz@gmail.com> wrote:
>> On 7/15/13 6:55 PM, sebb wrote:
>>> At present the Frequency class seems to treat NaN as just another Comparable.
>>> On my system it sorts as above POSITIVE_INFINITY.
>>>
>>> So getMode() can return NaN entries, as can the iterators.
>>>
>>> Is this reasonable behaviour?
>>> Should it be documented and therefore tested?
>>>
>>> Or should NaN be disallowed from entering the frequency table?
>>>
>>> I suppose it would be easy enough to subclass Frequency and override
>>> incrementValue(Comparable<?> v, long increment) to reject NaNs.
>>>
>>> But then we would need to document that addValue(Comparable<?> v)
>>> calls it - or the subclass would need to override both to be sure.
>> My opinion is that Frequency should not treat Double.NaN specially.
>> This is because Frequency *requires* a total ordering of whatever
>> domain objects it works with and the semantics of all of its methods
>> are defined in terms of the domain and the specified comparator.
>> The (default) natural ordering (expressed by Double.compareTo) puts
>> NaN at the top of the ordering (making it inconsistent with equals,
>> but keeping it total).  I don't think it is a good idea to try to
>> put special code in to restrict the domain just for Doubles.   It
>> probably is a good idea to document and test the current behavior;
>> though I suspect that (other than the case below) Double will rarely
>> be used as the domain of a Frequency instance, as discrete
>> distribution values are most often mapped onto integers.
> OK, I'll add comments and tests for Double.NaN.
>
>> In StatUtils, on the other hand, I think it makes sense to drop NaNs
>> from mode(double[] sample), since in this case we know the arguments
>> are doubles and we know what NaN means (or more precisely, what it
>> does not mean :)
> So the StatUtils method must drop any NaN entries itself.
Right.

Phil
>
>> Phil
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>> For additional commands, e-mail: dev-help@commons.apache.org
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> For additional commands, e-mail: dev-help@commons.apache.org
>>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Mime
View raw message