commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gilles <gil...@harfang.homelinux.org>
Subject Re: [Math] MATH-1129
Date Wed, 18 Jun 2014 12:32:31 GMT
Hello Luc.

>>
>> See
>>   https://issues.apache.org/jira/browse/MATH-1129
>>
>> The problem reported was due to the sorting procedure not behaving
>> correctly in the presence of NaN.
>> This procedure could be replaced by an equivalent method from the 
>> JDK:
>>   java.util.Arrays.sort(double[],int,int)
>>
>> Any objection?
>
> If you imply removing the select, medianOf3 and partition methods, 
> yes I
> would object. If you imply replacing only the insertionSort method 
> used
> for small sub-arrays, then I almost agree.

Issue 1129 concerns the latter: See the comment in "insertionSort" in 
the
current code.

However, for the former, you should really have a look at MATH-1120
   https://issues.apache.org/jira/browse/MATH-1120
The proposed patch indeed touches those methods.

>
> However, I'm not sure this would be sufficient as the sort is done
> partially in all the methods above. The former ones are used to split
> the big array into smaller ones and sorting only the sub-arrays that 
> are
> needed to compute the percentile (it is really a selection algorithm,
> not a full-sort algorithm). So I guess we should look at all the 
> methods
> at once to ensure proper handling of NaNs. The trick is that all
> comparisons involving NaN return false (<, <=, >, >=, ==, !=),
> regardless of the NaN being at the left hand side or right hand side 
> of
> the comparison (so we get the funny consequence that if a is a NaN, 
> the
> test a == a evaluates to false ...).

As of r1603217, "insertionSort" seems to behave correctly (at least on 
the
example reported in MATH-1129).

>
> The fast an insertion sort is used at the end is because it is very
> simply and efficient for small arrays, which is exactly what happens
> here: the array part has MIN_SELECT_SIZE elements or less, i.e. 15
> elements or less.

That was my question, implicitly: is it worth not using the JDK, or 
IOW,
is "insertionSort" so much faster than "java.util.Arrays.sort" that it 
is
worth maintaining a custom implementation?

>
> As I wrote this selection algorithm, I could do the check if you 
> want.

What check?


Regards,
Gilles


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Mime
View raw message