commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From sebb <seb...@gmail.com>
Subject Re: [MATH] Should Frequency treat NaN specially?
Date Tue, 16 Jul 2013 09:07:47 GMT
On 16 July 2013 07:02, Phil Steitz <phil.steitz@gmail.com> wrote:
> On 7/15/13 6:55 PM, sebb wrote:
>> At present the Frequency class seems to treat NaN as just another Comparable.
>> On my system it sorts as above POSITIVE_INFINITY.
>>
>> So getMode() can return NaN entries, as can the iterators.
>>
>> Is this reasonable behaviour?
>> Should it be documented and therefore tested?
>>
>> Or should NaN be disallowed from entering the frequency table?
>>
>> I suppose it would be easy enough to subclass Frequency and override
>> incrementValue(Comparable<?> v, long increment) to reject NaNs.
>>
>> But then we would need to document that addValue(Comparable<?> v)
>> calls it - or the subclass would need to override both to be sure.
>
> My opinion is that Frequency should not treat Double.NaN specially.
> This is because Frequency *requires* a total ordering of whatever
> domain objects it works with and the semantics of all of its methods
> are defined in terms of the domain and the specified comparator.
> The (default) natural ordering (expressed by Double.compareTo) puts
> NaN at the top of the ordering (making it inconsistent with equals,
> but keeping it total).  I don't think it is a good idea to try to
> put special code in to restrict the domain just for Doubles.   It
> probably is a good idea to document and test the current behavior;
> though I suspect that (other than the case below) Double will rarely
> be used as the domain of a Frequency instance, as discrete
> distribution values are most often mapped onto integers.

OK, I'll add comments and tests for Double.NaN.

> In StatUtils, on the other hand, I think it makes sense to drop NaNs
> from mode(double[] sample), since in this case we know the arguments
> are doubles and we know what NaN means (or more precisely, what it
> does not mean :)

So the StatUtils method must drop any NaN entries itself.

> Phil
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> For additional commands, e-mail: dev-help@commons.apache.org
>>
>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Mime
View raw message