commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Luc Maisonobe <Luc.Maison...@free.fr>
Subject Re: [math] UnexpectedNegativeIntegerException
Date Sun, 26 Aug 2012 07:20:25 GMT
Le 25/08/2012 23:43, Gilles Sadowski a écrit :
> Hello Luc.

Hi Gilles,

> 
> On Fri, Aug 24, 2012 at 09:31:41AM +0200, Luc Maisonobe wrote:
>> Le 24/08/2012 01:35, Gilles Sadowski a écrit :
>>> On Thu, Aug 23, 2012 at 12:00:56PM -0700, Phil Steitz wrote:
>>>> On 8/23/12 5:37 AM, Luc Maisonobe wrote:
>>>>> Le 23/08/2012 13:37, Gilles Sadowski a écrit :
>>>>>> On Thu, Aug 23, 2012 at 12:35:18PM +0200, Sébastien Brisard wrote:
>>>>>>> Hi Gilles,
>>>>>>>
>>>>>>> 2012/8/23 Gilles Sadowski <gilles@harfang.homelinux.org>:
>>>>>>>> On Thu, Aug 23, 2012 at 10:05:10AM +0200, Sébastien Brisard
wrote:
>>>>>>>>> Hi Luc,
>>>>>>>>>
>>>>>>>>> 2012/8/23 Luc Maisonobe <Luc.Maisonobe@free.fr>:
>>>>>>>>>> Le 23/08/2012 05:16, Sébastien Brisard a écrit
:
>>>>>>>>>>> Hi,
>>>>>>>>>>> in MATH-849, I have proposed an implementation
of Gamma(x)
>>>>>>>>>>> (previously, class Gamma had only logGamma(x)).
Gamma(x) is not
>>>>>>>>>>> defined for x negative integer. In such instances,
I would like to
>>>>>>>>>>> throw an exception instead of returning Double.NaN.
However, creating
>>>>>>>>>>> a new exception in o.a.c.m.exception seems exagerated,
since it's very
>>>>>>>>>>> unlikely that this exception should be used elsewhere
(or maybe).
>>>>>>>>>>> Should I define a nested exception instead [1]?
>>>>>>>>>>>
>>>>>>>>>>> What do you think of the name "UnexpectedNegativeIntegerException"?
It
>>>>>>>>>>> does not really match the pattern of already
defined exceptions, but I
>>>>>>>>>>> can't find a better name.
>>>>>>>>>> Don't we already have NotPositiveException?
>>>>>>>>>>
>>>>>>>>>> Luc
>>>>>>>>>>
>>>>>>>>> We do, but Gamma is defined for all negative values,
except integer ones...
>>>>>>>> I think that in some circumstances, it might be useful to
not throw
>>>>>>>> exceptions...
>>>>>>>> FastMath's "log" returns NaN for negative input.
>>>>>>>>
>>>>>>> then I guess that logGamma(x) should also return NaN if x <=
0?
>>>>>> Anyways, that's what it does currently.
>>>>>>
>>>>>>> I have to say I do not really like this option.
>>>>>> So did you intend to change that?
>>>>>>
>>>>>>> My life would
>>>>>>> sometimes be much easier if NaNs didn't exist... the good old
days of
>>>>>>> the "floating-point error".
>>>>
>>>> We have different memories - I am old enough to remember wasting
>>>> real money due to jobs failing on "floating point checks" when I
>>>> would have preferred to have computations complete and return NaN
>>>> (which had not been invented yet).
>>>>
>>>>>> IIRC NaN could be useful for example in an optimization algorithm;
excerpt
>>>>>> from Kahan:
>>>>>> ---
>>>>>> [...] NaNs is an opportunity ( not obligation ) for software ( especially
>>>>>> when searching ) to follow an unexceptional path ( no need for exotic
>>>>>> control structures ) to a point where an exceptional event can be
appraised
>>>>>> after the event, when additional evidence may have accrued. [...]
>>>>>> ---
>>>>>>
>>>>>> I do not say that Commons Math should prefer NaN over throwing exceptions.
>>>>> If the choice is allowed would really prefer NaN for such cases.
>>>>
>>>> +1
>>>>>
>>>>>> Maybe that it depends on how high-level an application is (i.e. if
there are
>>>>>> already calls to methods that could throw exceptions, then an algorithm
that
>>>>>> would not use try/catch to protect itself would fail anyway).
>>>>>> If we want Commons Math to allow taking advantage of NaNs, it would
probably
>>>>>> need to be updated so that a lot of precondition checks ought to
be removed
>>>>>> (but this will likely lead to reduced robustness in some applications
that
>>>>>> do not do their own checks...).
>>>>> This would clearly be cumbersome for users.
>>>>> Since we have changed our exception hierarchy, we don't have a single
>>>>> root anymore, so users simply cannot catch all exception we throw at
>>>>> once, they have to check every single type, and make sure they are
>>>>> thrown by themselves without any help from compiler.
>>>>>
>>>>> Just adding new exceptions is too much, we have already gone too far
>>>>> this way.
>>>>
>>>> +1 and the fact that all exceptions are unchecked makes it even
>>>> harder for client apps as the compiler will not help / force them. 
>>>
>>> For those new to this list, they can search the archive for lengthy
>>> discussions on this subject.
>>> The summary is that CM has absolutely no use of checked exceptions.
>>
>> Another really important part is that CM doesn't have anymore a
>> hierarchy of exception. It has a bunch of completely unrelated
>> exceptions, all extending different standard Java exceptions. This is
>> the part that really bothers me, but this is the current consensus.
> 
> I would be OK to change it to back to the previous state of affairs, that
> is, the one where we had agreed on a singly rooted hierarchy with base class
> "MathRuntimeException".
> The current consensus was reached because you didn't voice the concern you
> are now mentioning.

I thought I had. Perhaps this feature was set up after I gave up on this
discussion.

> 
> It would be quite easy to change, if it would make your life easier.
> 
> The more so that I never saw what is gained from copying the Java hierachy
> (in the particular case of the exceptions): Because some exception inherits
> from the Java standard one does not bring special benefits to the
> application that has to catch that exception. I mean: Is there any piece of
> code that would behave differently if it caught "IllegalArgumentException"
> vs "IllegalStateException"? If not, it could as well be prepared to catch
> a "MathRuntimeException" (and do the same thing).
> [The various exception types are primarily there to discriminate between
> various _problems_; but are not likely helpful to help the caller devise a
> way to react to the exception once it is raised (other than acknowledge the
> fact than CM could not perform the requested action).]
> 
> In CM, the vastly overwhelming majority of exceptions are instances of
> "MathIllegalArgumentException" or one of its subclasses.
> 
> We have a "NullArgumentException" but we also agreed that it did not have to
> be a subclass of the standard Java "NullPointerException". So in this case,
> we already depart from the "standard". [But we also speculated that the
> policy could to never check for "null" and let the JVM do that, This behaviour
> is _not_ consistent throughout CM.]
> 
> Number of occurrences of CM exceptions that are subclasses of those Java
> standard exceptions:
>  * IllegalStateException (43)
>  * UnsupportedOperationException (22)
>  * ArithmeticException (54)
> 
> In summary, I have no problem with a "MathRuntimeException" base class which
> "MathIllegalArgumentException", "MathIllegalStateException",
> "MathUnsupportedOperationException", "MathArithmeticException" would inherit
> from.
> 
> Applications that call CM would be safe (apart from bugs raising "NPE")
> with a unique catch clause intercepting "MathRuntimeException".

I am happy (and surprised) to read that.
I would really much like to go back to a single root exception
hierarchy. This both helps top level application as depending on context
they can either pinpoint the exception they want to catch or they can
have a grab all strategy. It is their choice.

For sure, this is something that can be done only for a major release.

> 
>>
>>> Client apps cannot do more with checked exceptions, and can be made as
>>> "safe" by wrapping calls in try-blocks.
>>> On the other hand, client source code is much cleaner without unnecessary
>>> "throws" clauses or wrapping of checked expections at all levels.
>>> Some Java experts go as far as saying that checked exceptions were a
>>> language design mistake (never repeated in languages invented more
>>> recently).
>>>
>>>> There is a reason that NaNs exist.  It is much cheaper to return a
>>>> NaN than to raise (and force the client to handle) an exception. 
>>>> This is not precise and probably can't be made so, but I have always
>>>> looked at things more or less like this:
>>>>
>>>> 0) IAE (which I see no need to specialize as elaborately as we have
>>>> done in [math]) is for clear violation of the documented API
>>>> contract.  The actual parameters "don't make sense" in the context
>>>> of the API.
>>>
>>> The "elaboration" is actually very basic (but that's a matter of taste), but
>>> it was primarily promoted (by me) in order to hide (as much as possible) the
>>> ugliness (another matter of taste) of the "LocalizedFormats" enum, and its
>>> inconsequent use (duplication). [Cf. discussions in the archive.]
>>>
>>>> 1) NaN can be returned as the result of a computation which, when
>>>> started with legitimate arguments, does not result in a
>>>> representable value.
>>>
>>> According to this description, Sébastien's case _must_ be handled by an
>>> exception: the argument is _not_ legtimate.
>>> The usage of NaN I was referring to is to let a computation proceed ("follow
>>> an unexceptional path") in the hope that the final result might still be
>>> meaningful.
>>> If the NaN persists, not checking for it and signalling the problem (i.e.
>>> raise an exception) is a bug. This is to avoid that (and be robust) that we
>>> do extensive precondition checks in CM. But this has the unavoidable
>>> drawback that the use of NaN as suggested is much less likely to be feasible
>>> when calling CM code. Once realizing that, it becomes much less obvious that
>>> there is _any_ advantage of letting NaNs propagate...
>>> [Anyone has an example of NaN usage? Please let me know.]
>>
>> I use NaN a lot as an indicator that a variable has not been fully
>> initialized yet. This occurs for example in iterative algorithms, where
>> some result is computed deep inside some loop and we don't know when the
>> loop will end. Then I write something along these lines:
>>
>>   while (Double.isNaN(result)) {
>>      // do something that hopefully will change result to non-NaN
>>   }
>>
>>   // now I know result has been computed
>>
>> Another use is to initialize some fields in class to values I know are
>> not meaningful. I can then us NaN as a marker to do lazy evaluation for
>> values that takes time to compute and should be computed only when both
>> really needed and when everything required for their computation is
>> available.
> 
> I should have said "[...] example of NaN usage, beyond singling out
> unitialized data [...]". The above makes use of NaN as "invalid" because it
> is not initialized (yet).

Yes.

> I'd assume that if "result" stays NaN after the allowed number of
> iterations, you raise an exception, i.e. you don't propagate NaN as the
> output of a computation that cannot provide a useful result. However, this
> (propagating NaN) is the behaviour of "srqt(-1)", for example.
> Thus, if you raise an exception, your computation does not behave in the
> same way as the function "sqrt".
> 
>> Another use is simply to detect some special cases in computations (like
>> sqrt(-1) or 0/0). I do the computation first and check the NaN
>> afterwards. See for example the detection of NaNs in the linear
>> combinations in MathArrays or in the nth order Brent solver.
> 
> OK, this is a good example, in line with the intended usage of NaN (as it
> avoids inserting control structures in the computation).

Yes. One of the main use case for this is when a computation involves a
loop and failure is very rare. So we avoid costly numerous if statements
within the loop and do a single check. In the few cases this single
check fails, we go to a diffrent branch to handle the failure. This is
exactly what is done in linear combination.

> 
>>
>> Another use of NaNs occurs when integrating various code components from
>> different origins in a single application. Data is forwarded between the
>> various components in all directions. Components never share the same
>> exceptions mechanisms. Components either process NaNs specially (which
>> is good) or they let the processor propagate them (it is what the IEEE
>> standard mandates) and at the end you can detect it reliably at
>> application level.
> 
> I'm not sure I understand this. Is it good or bad that a component lets NaNs
> propagate? Are there situations when it's good and others where it's bad?

In the cases I encountered, it is always good to have NaNs propagated. A
component that is not an application by itself but only a part (low or
intermediate level) often cannot decide at its level how to handle NaNs
except in rare cases. So it propagates them upward. The previous example
(linear combination in [math]) is of course a counter-example: we are at
low level, we know how to handle the NaN for this operation, so we
detect it and fix it.

> That's why I was asking (cf. quote from previous post below) what are the
> criteria, so that contributors know how to write code when the feature falls
> in one or the other category.
> 
>>
>>>
>>>> The problem is that contracts can often be written so that instances
>>>> of 1) are turned into instances of 0).  Gamma(-) is a great
>>>> example.  The singularities at negative integers could be viewed as
>>>> making negative integer arguments "illegal" or "nonsense" from the
>>>> API standpoint,
>>>
>>> They are just nonsense (not just from an API standpoint).
>>>
>>>> or legitimate arguments for which no well-defined,
>>>> representable value can be returned.  Personally, I would prefer to
>>>> get NaN back from this function and just point out the existence of
>>>> the singularities in the javadoc.
>>>
>>> This is consistent with how basic math functions behave, but not with the
>>> general rule/convention of most of CM code.
>>> It may be fine that we have several ways to deal with exceptional
>>> conditions, but it might be nice, as with formatting, to have rules so that
>>> we know how to write contributions.
>>
>> Too many rules are not a solution, especially when there are no tools to
>> help enforce these rules are obeyed. Relying only on the fact human
>> proof-reading will enforce them is wishful thinking.
>>
> 
> What is "too many"? ["How long should a person's legs be?" ;-)]
> I don't agree with the "wishful thinking" statement; a "diff" could probably
> show a lot a manual corrections to the code and comment formatting. [Mainly
> in the sources which I touched at some point...]

I'm not sure I understand your point. Mine is that rules that are not
backed by automated tools are a pain to enforce, and hence are not
fulfilled most of the time, except at a tremendous human resource cost.
In fact, even rules which can be associated with tools are broken during
development for some time. We do not use
checkstyle/CLIRR/findbugs/PMD/RAT for all commits for example, but do a
fix pass from time to time.

> 
> There are other areas where there is only human control, namely the "svn
> log" messages where (no less picky) rules are enforced just because it
> helps _humans_ in their change overview task.
> 
> As pointed out by Jared, it's not a big problem to comply with rules once
> you know them.

I fully agree with that, but I also think Phil is right when he says too
many rules may discourage potential contributors. I remember a link he
sent to us months ago about to a presentation by Michael Meeks about
interacting with new developers
<http://people.gnome.org/~michael/data/2011-10-13-new-developers.pdf>.
Slides numers 3 an 4 are a fabulous example. I think we are lucky Jared
has this state of mind and accepts picky rules easily. I'm not sure such
an open mind is widespread among potential contributors.

> Keeping source code tidy is quite helpful, and potential contributors will
> be happy that they can read any CM source files and immediately recognize
> that they are part of the same library...

Yes, of course. But the entry barrier should not be too high.

best regards,
Luc

> 
> 
> Best regards,
> Gilles
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
> 
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Mime
View raw message