commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Sébastien Brisard <sebastien.bris...@m4x.org>
Subject Re: svn commit: r1348721 - in /commons/proper/math/trunk/src: main/java/org/apache/commons/math3/linear/OpenMapRealVector.java test/java/org/apache/commons/math3/linear/SparseRealVectorTest.java
Date Sat, 16 Jun 2012 07:38:19 GMT
Hi Gilles,

2012/6/16 Gilles Sadowski <gilles@harfang.homelinux.org>:
> Hi.
>
>> >>
>> >> > [...]
>> >> >
>> >> >> I fully agree. We could also opt for a less correct, but more
>> >> >> efficient solution: we do not store the sign of zero, and return
NaN
>> >> >> each time v / zero occurs. The result should be NaN anyway, because
>> >> >> its sign is undecidable. This specificity would be clearly stated
in
>> >> >> the javadoc. What do you think?
>> >> >
>> >> > We had this sort of discussion with the "Complex" class, where there
are
>> >> > such issues as "standard" vs "correct" vs "documented" etc. E.g. some
>> >> > computations return "NaN" where it should be "infinity" (or vice-versa,
I
>> >> > don't remember exactly). See MATH-667:
>> >> >  https://issues.apache.org/jira/browse/MATH-667
>> >> >
>> >> > Maybe that we must be guided with some use-cases for this class.
>> >> > Maybe in actual uses, the problems discussed here do not appear, and
any
>> >> > additional check would destroy the usefulness of this class in such
cases
>> >> > (just a wild guess). Maybe that we should drop support for sparse vectors!
>> >> >
>> >> I fully agree with you. Speaking quite honestly as a user of the
>> >> linear package, even if I know that IEEE floats handle infinite values
>> >> well, I tend to dish results that contain those values, and try to
>> >> cure the cause of these unwanted values. So, as a user, whether the
>> >> method returns NaN of infinity does not matter to me (as long as it
>> >> does not return zero --or any normal value-- quietly): in any case, I
>> >> would do a computation again, this time with "correct" values.
>> >>
>> >> I also think that in this case, strict adherence to IEEE standards
>> >> (which were written for scalar values, not vectors) should not
>> >> compromise the performance of the class.
>> >
>> > Am I mistaken, or is this at odds with a fix that will make the "division"
>> > test pass? [I.e. If there is no way to distinguish between +0 and -0 as the
>> > default entry then "1 / default" will always be "+inf".]
>> >
>>
>> Do you mean I am bending the rules in favour of this specific case?
>
> I'm just referring to that failure:
> ---
> java.lang.AssertionError: entry #14, left = 1.0, right = -0.0 expected:<-Infinity>
but was:<Infinity>
> ---
>
> I.e. you can never make this test pass if you do not keep the sign.
>
Agreed

>
>> In
>> fact, it could be argued that the returned value *should* be NaN,
>> since its sign is not decidable. Isn't that the reason why NaNs were
>> created in the first place? If we clearly state that for sparse
>> vectors with zero default value, the sign is lost, then the logical
>> conclusion is that ebeDivide should return NaN instead of infinity.
>> This is only my point of view, and I'm quite happy with any decision
>> (since none is going to be perfect...).
>
> I'm not convinced; I'd rather throw an exception.
>
OK, that's an option.

>> >>
>> >> I would be more cautious with your suggestion about getting rid of
>> >> sparse vectors. I think it's nice to have them, even if the support is
>> >> limited (we should of course concentrate on array-based vectors
>> >> first). As long as we clearly document these limits, I'd like to keep
>> >> them. Besides, with the new set of abstract tests being constructed
>> >> for any vector class, it should be possible in the future to propose
>> >> and test a better implementation. The problem we have is clearly lack
>> >> of feedback on these features (remember the debates we had on the
>> >> "sparseIterators"-- not perfect, but experience only can tell how to
>> >> improve).
>> >
>> > Well, if we know that it doesn't work well and we don't know how to fix it
>> > while at the same time keep its usefulness (efficiency-wise), we have to
>> > raise the issue of the necessity to keep supporting it even though it seems
>> > that nobody uses it.
>> >
>> > For the time being (staying compatible), we should at least fix the failing
>> > unit test problem; hence decide for example (as you said before) that the
>> > purpose is to save memory, but the trade-off is loss of performance.
>> >
>> Yes, but the problem is that the inefficient fix has already been
>> implemented... And it does not fix all bugs.
>> Anyway, we could probably aim at the most correct solution (still to
>> be defined!), at the expense of performance, since a more efficient
>> implementation can easily be implemented through the newly implemented
>> visitor pattern.
>
>
> If the visitor can compute a correct answer more efficiently, then IMO we
> should readily deprecate the method and remove it ASAP.
>
Sorry, that wasn't what I meant. I meant that a visitor can compute a
less correct answer more efficiently. So if users can cope with
special values incorrectly handled, this opportunity is offered to
them, while the default implementation is more correct.

>>
>> > Maybe that we could also impose that it is forbidden to divide by a sparse
>> > vector whose default value is zero. This will avoid a check on all entries
>> > (at the expense of forbidding legitimate divisions, when all the entries
>> > have a non-default value; but then one would wonder why using a sparse
>> > vector...).
>> >
>> That's also an idea to explore. I have to say I still do not see
>> clearly what we should do. I quickly looked at BLAS, they apparently
>> do not provide an element-by-element division. I should look at
>> octave, scilab, numpy...
>
>
> Gilles
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>

Gilles, I have the feeling that your prefered option is to store the
sign of the default value. In theory, that would solve the problem. I
see two objections
1. sparse vectors would be less sparse (most users do not care about
the sign of the zero entries!)
2. there is a threshold to decided whether or not an entry is actually
zero. So the current implementation already assumes the sign is lost.

I'm really confused about the best solution to adopt.

Sébastien


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Mime
View raw message