commons-issues mailing list archives

From "Rob Tompkins (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (MATH-1233) Uncommon wilcoxon signed-rank p-values
Date Sun, 30 Apr 2017 18:39:04 GMT

    [ https://issues.apache.org/jira/browse/MATH-1233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15990332#comment-15990332
] 

Rob Tompkins edited comment on MATH-1233 at 4/30/17 6:38 PM:
-------------------------------------------------------------

After some reading here, the assumptions on the data given are:

# Data are paired and come from the same population.
# Each pair is chosen randomly and independently.
# The data are measured at least on an ordinal scale (i.e., they cannot be nominal).

If the two input vectors are the same, is the consumer not violating assumption 3? I generally
agree that identical vectors should be handled as a special case, and I think we may
want to throw an exception. The only remaining question is one of performance: is an
array-equality check at the beginning of the procedure valuable enough that we are willing
to do it every time despite the _O( n )_ performance hit? Or do we simply document the fact
that we will not give reliable results when the vectors are the same?
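
To make the trade-off concrete, here is a minimal sketch of such a guard. It assumes that the check can be folded into the loop that computes the pairwise differences, which a signed-rank procedure needs anyway, so that no separate equality pass is required. The class and method names (PairedSampleGuard, differencesOrThrow) are hypothetical and not part of Commons Math, and the choice of IllegalArgumentException is also an assumption.

```java
/** Hypothetical helper for illustration only; not part of Commons Math. */
final class PairedSampleGuard {

    private PairedSampleGuard() {
        // static helper, no instances
    }

    /**
     * Computes the pairwise differences y[i] - x[i] and rejects input whose
     * differences are all zero (for example, two identical vectors).
     * The all-zero check shares the single O(n) loop with the subtraction.
     */
    static double[] differencesOrThrow(double[] x, double[] y) {
        if (x == null || y == null) {
            throw new NullPointerException("samples must not be null");
        }
        if (x.length != y.length) {
            throw new IllegalArgumentException("samples must have the same length");
        }
        double[] z = new double[x.length];
        boolean allZero = true;
        for (int i = 0; i < x.length; i++) {
            z[i] = y[i] - x[i];
            if (z[i] != 0.0) {
                allZero = false;
            }
        }
        if (allZero) {
            // Assumption: an exception is preferred over returning an unreliable p-value.
            throw new IllegalArgumentException("all paired differences are zero");
        }
        return z;
    }
}
```

Written this way, the guard costs one extra comparison per element rather than a second pass over the arrays, and it also catches empty input. Whether an exception is the right response, as opposed to documenting the limitation, remains the open question above.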


was (Author: chtompki):
After some reading here, the assumptions on the data given are:

# Data are paired and come from the same population.
# Each pair is chosen randomly and independently.
# The data are measured at least on an ordinal scale (i.e., they cannot be nominal).

I wonder if the two input vectors are the same, if we are not violating 3. I generally agree
here that the same vector should be treated in its own way. I would think that we may want
to throw an exception. The only question then becomes performance in nature, in that, is doing
array equality at the beginning of the procedure valuable enough that we are willing to do
it every time despite the _O( n )_ performance hit? Or do we simply document the fact that
we'll not give reliable results when the vectors are the same.

> Uncommon wilcoxon signed-rank p-values
> --------------------------------------
>
>                 Key: MATH-1233
>                 URL: https://issues.apache.org/jira/browse/MATH-1233
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 3.5
>            Reporter: Icaro Cavalcante Dourado
>             Fix For: 4.0
>
>         Attachments: MATH-1233-test.patch
>
>
> This implementation in WilcoxonSignedRankTest looks weird. For equal vectors, the correct
pValue should be 1, because it is the probability that the vectors come from the same population.
> By contrast, this implementation returns ~0 for equal vectors. So we need to check
pValue > significanceLevel to reject the H0 hypothesis, while in R and many other
tools we do the opposite: pValue <= significanceLevel gives us an argument to reject
the null hypothesis.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
