On 6/29/14, 2:30 PM, Gilles wrote:
> On Sun, 29 Jun 2014 10:25:58 0700, Phil Steitz wrote:
>> On 6/29/14, 9:48 AM, venkatesha murthy wrote:
>>> On Sun, Jun 29, 2014 at 1:25 AM, Phil Steitz
>>> <phil.steitz@gmail.com> wrote:
>>>
>>>> On 6/25/14, 12:12 AM, Luc Maisonobe wrote:
>>>>> Le 25/06/2014 06:05, venkatesha murthy a écrit :
>>>>>> On Wed, Jun 25, 2014 at 12:21 AM, Luc Maisonobe
>>>>>> <luc@spaceroots.org>
>>>> wrote:
>>>>>>> Hi Venkat,
>>>>>>>
>>>>>>> Le 23/06/2014 21:08, venkatesha murthy a écrit :
>>>>>>>> On Tue, Jun 24, 2014 at 12:08 AM, Luc Maisonobe
>>>>>>>> <luc@spaceroots.org>
>>>>>>> wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> While looking further in Percentile class for MATH1120,
I
>>>>>>>>> have found
>>>>>>>>> another problem in the current implementation.
>>>>>>>>> NaNStrategy.FIXED
>>>> should
>>>>>>>>> leave the NaNs in place, but at the end of the
>>>>>>>>> KthSelector.select
>>>>>>>>> method, a call to Arrays.sort moves the NaNs to the end
of
>>>>>>>>> the small
>>>>>>>>> subarray. What is really implemented here is therefore
>>>>>>>>> closer to
>>>>>>>>> NaNStrategy.MAXIMAL than NaNStrategy.FIXED. This always
>>>>>>>>> occur in the
>>>>>>>>> test cases because they use very short arrays, and we
>>>>>>>>> directly
>>>> switch to
>>>>>>>>> this part of the select method.
>>>>>>>> Are NaNs considered higher than +Inf ?
>>>>>>>> If MAXIMAL represent replacing for +inf ; you need
>>>>>>>> something to
>>>>>>>> indicate beyond this for NaN.
>>>>>>> Well, we can just keep the NaN themselves and handled them
>>>>>>> appropriately, hoping not to trash performances too much.
>>>>>>>
>>>>>> Agreed.
>>>>>>
>>>>>>>> What is the test input you see an issue and what is the
>>>>>>>> actual error
>>>>>>>> you are seeing. Please share the test case.
>>>>>>> Just look at PercentileTest.testReplaceNanInRange(). The
>>>>>>> first check in
>>>>>>> the test corresponds to a Percentile configuration at 50%
>>>>>>> percentile,
>>>>>>> and NaNStrategy.FIXED. The array has an odd number of
>>>>>>> entries, so the
>>>>>>> 50% percentile is exactly one of the entries: the one at
>>>>>>> index 5 in the
>>>>>>> final array.
>>>>>>>
>>>>>>> The initial ordering is { 0, 1, 2, 3, 4, NaN, NaN, 5, 7,
>>>>>>> NaN, 8 }. So
>>>>>>> for the NaNStrategy.FIXED setting, it should not be modified
>>>>>>> at all in
>>>>>>> the selection algorithm and the result for 50% should be the
>>>>>>> first NaN
>>>>>>> of the array, at index 5. In fact, due to the Arrays.sort,
>>>>>>> we *do*
>>>>>>> reorder the array into { 0, 1, 2, 3, 4, 5, 7, 8, NaN, NaN,
>>>>>>> NaN }, so
>>>>>>> the result is 5.
>>>>>>>
>>>>>>> Agreed. just verified by putting back the earlier insertionSort
>>>> function.
>>>>>>> If we use NaNStrategy.MAXIMAL and any quantile above 67%, we
>>>>>>> get as a
>>>>>>> result Double.POSITIVE_INFINITY instead of Double.NaN.
>>>>>>>
>>>>>>> If we agree to leave FIXED as unchanged behaviour with your
>>>> insertionSort
>>>>>> code; then atleast MAXIMAL/MINIMAL should be allowed for
>>>>>> transformation
>>>> of
>>>>>> NaN to +/Inf
>>>>> I'm fine with it, as long as we clearly state it in the
>>>>> NaNStrategy
>>>>> documentation, saying somethig like:
>>>>>
>>>>> When MAXIMAL/MINIMAL are used, the NaNs are effecively
>>>>> *replaced* by
>>>>> +/ infinity, hence the results will be computed as if
>>>>> infinities were
>>>>> there in the first place.
>>>> 1  this is mixing concerns. NaNStrategy exists for one specific
>>>> purpose  to turn extend the ordering on doubles a total ordering.
>>>> Strategies prescribe only how NaNs are to be treated in the
>>>> ordering. Side effects on the input array don't make sense in
>>>> this
>>>> context. The use case for which this class was created was rank
>>>> transformations. Returning infinities in rank transformations
>>>> would
>>>> blow up computations in these classes. If a floating point
>>>> computations need to transform NaNs, the specific stats / other
>>>> classes that do this transformation should perform and document
>>>> the
>>>> behavior.
>>>>
>>>> Phil
>>>>
>>> OK. Agreed realized it later. Hence i have not touched
>>> NaNStrategy in my
>>> patch(MATH1132) . Now i have added a separate transformer to
>>> transform NaNs
>>> but it uses NanStrategy. Please let me know on this as this
>>> trasnformation
>>> itself
>>> can be used in different places.
>>
>> Honestly, I don't see the value of this. I would be more favorable
>> to an addition to MathArrays supporting NaN (or infinity) filtering
>> / transformation. Several years back we made the decision to just
>> let NaNs propagate in computations, which basically means that if
>> you feed NaNs to [math] you are going to get NaNs out in results.
>> If we do get into the business of filtering NaNs (or infinities for
>> that matter), I think it is probably best to do that in MathArrays
>> or somesuch  i.e., don't decorate APIs with NaN or
>> infinityhandling handling descriptions. That would be a huge task
>> and would likely end up bleeding implementation detail into the
>> APIs. As I said above, NaNStrategy has to exist for rank
>> transformations to be welldefined. NaNhandling in floating point
>> computations is defined and as long as we fully document what our
>> algorithms do, I think it is fair enough to "let the NaNs fall where
>> they may"  which basically means users need to do the filtering /
>> transformation themselves.
>
> So, do I understand correctly that it's OK to add the "remove" and
> "replace" utility methods (cf. MATH1130) in "MathArrays"?
> Or does this thread conclude that we don't have a use for this within
> Commons Math?
You raise a good point here. Personally, I don't see an internal
use and if we are sticking with MathArrays including only things we
have internal use for, I would say no, don't include them.
Phil
>
> Gilles
>
>>
>> Phil
>>>
>>>>> [...]
>
>
> 
> To unsubscribe, email: devunsubscribe@commons.apache.org
> For additional commands, email: devhelp@commons.apache.org
>
>

To unsubscribe, email: devunsubscribe@commons.apache.org
For additional commands, email: devhelp@commons.apache.org
