commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Phil Steitz <phil.ste...@gmail.com>
Subject Re: [math] UnexpectedNegativeIntegerException
Date Sun, 26 Aug 2012 17:42:16 GMT
On 8/26/12 12:20 AM, Luc Maisonobe wrote:
> Le 25/08/2012 23:43, Gilles Sadowski a écrit :
>> Hello Luc.
> Hi Gilles,
>
>> On Fri, Aug 24, 2012 at 09:31:41AM +0200, Luc Maisonobe wrote:
>>> Le 24/08/2012 01:35, Gilles Sadowski a écrit :
>>>> On Thu, Aug 23, 2012 at 12:00:56PM -0700, Phil Steitz wrote:
>>>>> On 8/23/12 5:37 AM, Luc Maisonobe wrote:
>>>>>> Le 23/08/2012 13:37, Gilles Sadowski a écrit :
>>>>>>> On Thu, Aug 23, 2012 at 12:35:18PM +0200, Sébastien Brisard
wrote:
>>>>>>>> Hi Gilles,
>>>>>>>>
>>>>>>>> 2012/8/23 Gilles Sadowski <gilles@harfang.homelinux.org>:
>>>>>>>>> On Thu, Aug 23, 2012 at 10:05:10AM +0200, Sébastien
Brisard wrote:
>>>>>>>>>> Hi Luc,
>>>>>>>>>>
>>>>>>>>>> 2012/8/23 Luc Maisonobe <Luc.Maisonobe@free.fr>:
>>>>>>>>>>> Le 23/08/2012 05:16, Sébastien Brisard a écrit
:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>> in MATH-849, I have proposed an implementation
of Gamma(x)
>>>>>>>>>>>> (previously, class Gamma had only logGamma(x)).
Gamma(x) is not
>>>>>>>>>>>> defined for x negative integer. In such instances,
I would like to
>>>>>>>>>>>> throw an exception instead of returning Double.NaN.
However, creating
>>>>>>>>>>>> a new exception in o.a.c.m.exception seems
exagerated, since it's very
>>>>>>>>>>>> unlikely that this exception should be used
elsewhere (or maybe).
>>>>>>>>>>>> Should I define a nested exception instead
[1]?
>>>>>>>>>>>>
>>>>>>>>>>>> What do you think of the name "UnexpectedNegativeIntegerException"?
It
>>>>>>>>>>>> does not really match the pattern of already
defined exceptions, but I
>>>>>>>>>>>> can't find a better name.
>>>>>>>>>>> Don't we already have NotPositiveException?
>>>>>>>>>>>
>>>>>>>>>>> Luc
>>>>>>>>>>>
>>>>>>>>>> We do, but Gamma is defined for all negative values,
except integer ones...
>>>>>>>>> I think that in some circumstances, it might be useful
to not throw
>>>>>>>>> exceptions...
>>>>>>>>> FastMath's "log" returns NaN for negative input.
>>>>>>>>>
>>>>>>>> then I guess that logGamma(x) should also return NaN if x
<= 0?
>>>>>>> Anyways, that's what it does currently.
>>>>>>>
>>>>>>>> I have to say I do not really like this option.
>>>>>>> So did you intend to change that?
>>>>>>>
>>>>>>>> My life would
>>>>>>>> sometimes be much easier if NaNs didn't exist... the good
old days of
>>>>>>>> the "floating-point error".
>>>>> We have different memories - I am old enough to remember wasting
>>>>> real money due to jobs failing on "floating point checks" when I
>>>>> would have preferred to have computations complete and return NaN
>>>>> (which had not been invented yet).
>>>>>
>>>>>>> IIRC NaN could be useful for example in an optimization algorithm;
excerpt
>>>>>>> from Kahan:
>>>>>>> ---
>>>>>>> [...] NaNs is an opportunity ( not obligation ) for software
( especially
>>>>>>> when searching ) to follow an unexceptional path ( no need for
exotic
>>>>>>> control structures ) to a point where an exceptional event can
be appraised
>>>>>>> after the event, when additional evidence may have accrued. [...]
>>>>>>> ---
>>>>>>>
>>>>>>> I do not say that Commons Math should prefer NaN over throwing
exceptions.
>>>>>> If the choice is allowed would really prefer NaN for such cases.
>>>>> +1
>>>>>>> Maybe that it depends on how high-level an application is (i.e.
if there are
>>>>>>> already calls to methods that could throw exceptions, then an
algorithm that
>>>>>>> would not use try/catch to protect itself would fail anyway).
>>>>>>> If we want Commons Math to allow taking advantage of NaNs, it
would probably
>>>>>>> need to be updated so that a lot of precondition checks ought
to be removed
>>>>>>> (but this will likely lead to reduced robustness in some applications
that
>>>>>>> do not do their own checks...).
>>>>>> This would clearly be cumbersome for users.
>>>>>> Since we have changed our exception hierarchy, we don't have a single
>>>>>> root anymore, so users simply cannot catch all exception we throw
at
>>>>>> once, they have to check every single type, and make sure they are
>>>>>> thrown by themselves without any help from compiler.
>>>>>>
>>>>>> Just adding new exceptions is too much, we have already gone too
far
>>>>>> this way.
>>>>> +1 and the fact that all exceptions are unchecked makes it even
>>>>> harder for client apps as the compiler will not help / force them. 
>>>> For those new to this list, they can search the archive for lengthy
>>>> discussions on this subject.
>>>> The summary is that CM has absolutely no use of checked exceptions.
>>> Another really important part is that CM doesn't have anymore a
>>> hierarchy of exception. It has a bunch of completely unrelated
>>> exceptions, all extending different standard Java exceptions. This is
>>> the part that really bothers me, but this is the current consensus.
>> I would be OK to change it to back to the previous state of affairs, that
>> is, the one where we had agreed on a singly rooted hierarchy with base class
>> "MathRuntimeException".
>> The current consensus was reached because you didn't voice the concern you
>> are now mentioning.
> I thought I had. Perhaps this feature was set up after I gave up on this
> discussion.
>
>> It would be quite easy to change, if it would make your life easier.
>>
>> The more so that I never saw what is gained from copying the Java hierachy
>> (in the particular case of the exceptions): Because some exception inherits
>> from the Java standard one does not bring special benefits to the
>> application that has to catch that exception. I mean: Is there any piece of
>> code that would behave differently if it caught "IllegalArgumentException"
>> vs "IllegalStateException"? If not, it could as well be prepared to catch
>> a "MathRuntimeException" (and do the same thing).
>> [The various exception types are primarily there to discriminate between
>> various _problems_; but are not likely helpful to help the caller devise a
>> way to react to the exception once it is raised (other than acknowledge the
>> fact than CM could not perform the requested action).]
>>
>> In CM, the vastly overwhelming majority of exceptions are instances of
>> "MathIllegalArgumentException" or one of its subclasses.
>>
>> We have a "NullArgumentException" but we also agreed that it did not have to
>> be a subclass of the standard Java "NullPointerException". So in this case,
>> we already depart from the "standard". [But we also speculated that the
>> policy could to never check for "null" and let the JVM do that, This behaviour
>> is _not_ consistent throughout CM.]
>>
>> Number of occurrences of CM exceptions that are subclasses of those Java
>> standard exceptions:
>>  * IllegalStateException (43)
>>  * UnsupportedOperationException (22)
>>  * ArithmeticException (54)
>>
>> In summary, I have no problem with a "MathRuntimeException" base class which
>> "MathIllegalArgumentException", "MathIllegalStateException",
>> "MathUnsupportedOperationException", "MathArithmeticException" would inherit
>> from.
>>
>> Applications that call CM would be safe (apart from bugs raising "NPE")
>> with a unique catch clause intercepting "MathRuntimeException".
> I am happy (and surprised) to read that.
> I would really much like to go back to a single root exception
> hierarchy. This both helps top level application as depending on context
> they can either pinpoint the exception they want to catch or they can
> have a grab all strategy. It is their choice.

I like throwing (and catching) standard exceptions instead of
inventing variants of them, which is why I favored having MathIAE
inherit from IAE, etc.  I would have preferred to just throw IAE
directly, but we could not agree on how to do that and preserve
localization, so we ended up with the current setup where we have
custom variants, but they inherit from the standard exceptions.  I
am curious, Luc, about exactly what kinds of use cases will really
be easier / better for users if we go back to a single-rooted
hierarchy.  I get that instead of "catch Exception" or "catch
RuntimeException" you can at the top level "catch MathRTE" and that
will catch only the exceptions that come (at least originally) from
the [math] code.  Can you help me understand via an example how that
is a big benefit that is worth more than being able to "catch IAE"
or "catch IOE" directly?  What I am missing is how knowing that an
aspecific RTE came from within [math] makes a difference.  I am
skeptical about ever depending on that kind of conclusion because
dependencies may bring [math] code in at multiple levels.  Also, is
there an implied assumption in your ideal setup that *no* exceptions
propagate to [math] clients other than MRTE (i.e. we catch and wrap
everything)?

Phil
>
> For sure, this is something that can be done only for a major release.
>
>>>> Client apps cannot do more with checked exceptions, and can be made as
>>>> "safe" by wrapping calls in try-blocks.
>>>> On the other hand, client source code is much cleaner without unnecessary
>>>> "throws" clauses or wrapping of checked expections at all levels.
>>>> Some Java experts go as far as saying that checked exceptions were a
>>>> language design mistake (never repeated in languages invented more
>>>> recently).
>>>>
>>>>> There is a reason that NaNs exist.  It is much cheaper to return a
>>>>> NaN than to raise (and force the client to handle) an exception. 
>>>>> This is not precise and probably can't be made so, but I have always
>>>>> looked at things more or less like this:
>>>>>
>>>>> 0) IAE (which I see no need to specialize as elaborately as we have
>>>>> done in [math]) is for clear violation of the documented API
>>>>> contract.  The actual parameters "don't make sense" in the context
>>>>> of the API.
>>>> The "elaboration" is actually very basic (but that's a matter of taste),
but
>>>> it was primarily promoted (by me) in order to hide (as much as possible)
the
>>>> ugliness (another matter of taste) of the "LocalizedFormats" enum, and its
>>>> inconsequent use (duplication). [Cf. discussions in the archive.]
>>>>
>>>>> 1) NaN can be returned as the result of a computation which, when
>>>>> started with legitimate arguments, does not result in a
>>>>> representable value.
>>>> According to this description, Sébastien's case _must_ be handled by an
>>>> exception: the argument is _not_ legtimate.
>>>> The usage of NaN I was referring to is to let a computation proceed ("follow
>>>> an unexceptional path") in the hope that the final result might still be
>>>> meaningful.
>>>> If the NaN persists, not checking for it and signalling the problem (i.e.
>>>> raise an exception) is a bug. This is to avoid that (and be robust) that
we
>>>> do extensive precondition checks in CM. But this has the unavoidable
>>>> drawback that the use of NaN as suggested is much less likely to be feasible
>>>> when calling CM code. Once realizing that, it becomes much less obvious that
>>>> there is _any_ advantage of letting NaNs propagate...
>>>> [Anyone has an example of NaN usage? Please let me know.]
>>> I use NaN a lot as an indicator that a variable has not been fully
>>> initialized yet. This occurs for example in iterative algorithms, where
>>> some result is computed deep inside some loop and we don't know when the
>>> loop will end. Then I write something along these lines:
>>>
>>>   while (Double.isNaN(result)) {
>>>      // do something that hopefully will change result to non-NaN
>>>   }
>>>
>>>   // now I know result has been computed
>>>
>>> Another use is to initialize some fields in class to values I know are
>>> not meaningful. I can then us NaN as a marker to do lazy evaluation for
>>> values that takes time to compute and should be computed only when both
>>> really needed and when everything required for their computation is
>>> available.
>> I should have said "[...] example of NaN usage, beyond singling out
>> unitialized data [...]". The above makes use of NaN as "invalid" because it
>> is not initialized (yet).
> Yes.
>
>> I'd assume that if "result" stays NaN after the allowed number of
>> iterations, you raise an exception, i.e. you don't propagate NaN as the
>> output of a computation that cannot provide a useful result. However, this
>> (propagating NaN) is the behaviour of "srqt(-1)", for example.
>> Thus, if you raise an exception, your computation does not behave in the
>> same way as the function "sqrt".
>>
>>> Another use is simply to detect some special cases in computations (like
>>> sqrt(-1) or 0/0). I do the computation first and check the NaN
>>> afterwards. See for example the detection of NaNs in the linear
>>> combinations in MathArrays or in the nth order Brent solver.
>> OK, this is a good example, in line with the intended usage of NaN (as it
>> avoids inserting control structures in the computation).
> Yes. One of the main use case for this is when a computation involves a
> loop and failure is very rare. So we avoid costly numerous if statements
> within the loop and do a single check. In the few cases this single
> check fails, we go to a diffrent branch to handle the failure. This is
> exactly what is done in linear combination.
>
>>> Another use of NaNs occurs when integrating various code components from
>>> different origins in a single application. Data is forwarded between the
>>> various components in all directions. Components never share the same
>>> exceptions mechanisms. Components either process NaNs specially (which
>>> is good) or they let the processor propagate them (it is what the IEEE
>>> standard mandates) and at the end you can detect it reliably at
>>> application level.
>> I'm not sure I understand this. Is it good or bad that a component lets NaNs
>> propagate? Are there situations when it's good and others where it's bad?
> In the cases I encountered, it is always good to have NaNs propagated. A
> component that is not an application by itself but only a part (low or
> intermediate level) often cannot decide at its level how to handle NaNs
> except in rare cases. So it propagates them upward. The previous example
> (linear combination in [math]) is of course a counter-example: we are at
> low level, we know how to handle the NaN for this operation, so we
> detect it and fix it.
>
>> That's why I was asking (cf. quote from previous post below) what are the
>> criteria, so that contributors know how to write code when the feature falls
>> in one or the other category.
>>
>>>>> The problem is that contracts can often be written so that instances
>>>>> of 1) are turned into instances of 0).  Gamma(-) is a great
>>>>> example.  The singularities at negative integers could be viewed as
>>>>> making negative integer arguments "illegal" or "nonsense" from the
>>>>> API standpoint,
>>>> They are just nonsense (not just from an API standpoint).
>>>>
>>>>> or legitimate arguments for which no well-defined,
>>>>> representable value can be returned.  Personally, I would prefer to
>>>>> get NaN back from this function and just point out the existence of
>>>>> the singularities in the javadoc.
>>>> This is consistent with how basic math functions behave, but not with the
>>>> general rule/convention of most of CM code.
>>>> It may be fine that we have several ways to deal with exceptional
>>>> conditions, but it might be nice, as with formatting, to have rules so that
>>>> we know how to write contributions.
>>> Too many rules are not a solution, especially when there are no tools to
>>> help enforce these rules are obeyed. Relying only on the fact human
>>> proof-reading will enforce them is wishful thinking.
>>>
>> What is "too many"? ["How long should a person's legs be?" ;-)]
>> I don't agree with the "wishful thinking" statement; a "diff" could probably
>> show a lot a manual corrections to the code and comment formatting. [Mainly
>> in the sources which I touched at some point...]
> I'm not sure I understand your point. Mine is that rules that are not
> backed by automated tools are a pain to enforce, and hence are not
> fulfilled most of the time, except at a tremendous human resource cost.
> In fact, even rules which can be associated with tools are broken during
> development for some time. We do not use
> checkstyle/CLIRR/findbugs/PMD/RAT for all commits for example, but do a
> fix pass from time to time.
>
>> There are other areas where there is only human control, namely the "svn
>> log" messages where (no less picky) rules are enforced just because it
>> helps _humans_ in their change overview task.
>>
>> As pointed out by Jared, it's not a big problem to comply with rules once
>> you know them.
> I fully agree with that, but I also think Phil is right when he says too
> many rules may discourage potential contributors. I remember a link he
> sent to us months ago about to a presentation by Michael Meeks about
> interacting with new developers
> <http://people.gnome.org/~michael/data/2011-10-13-new-developers.pdf>.
> Slides numers 3 an 4 are a fabulous example. I think we are lucky Jared
> has this state of mind and accepts picky rules easily. I'm not sure such
> an open mind is widespread among potential contributors.
>
>> Keeping source code tidy is quite helpful, and potential contributors will
>> be happy that they can read any CM source files and immediately recognize
>> that they are part of the same library...
> Yes, of course. But the entry barrier should not be too high.
>
> best regards,
> Luc
>
>>
>> Best regards,
>> Gilles
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> For additional commands, e-mail: dev-help@commons.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Mime
View raw message