commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Luc Maisonobe <Luc.Maison...@free.fr>
Subject Re: [math] UnexpectedNegativeIntegerException
Date Sat, 01 Sep 2012 12:55:20 GMT
Le 29/08/2012 22:02, Gilles Sadowski a écrit :
> Hello.
> 
>>>> [...]
>>>> I encountered this need in two different cases. The first one was to
>>>> identify very precisely an error type, even with finer granularity than
>>>> exception type. Parsing the error message to recognize the exception is
>>>> evil, checking the enumerate used in the pattern is less evil. The
>>>> second case was when I needed to create a more elaborate message by
>>>> combining some information provided by the caller, and some information
>>>> extracted from the exception. Here again, parsing is evil but getting
>>>> the parameters is fine.
>>>
>>> Maybe you missed my point (same as above), as it applies here too: You can
>>> get the parameters through the accessors (of the specific exception types).
>>> We created the "context" so that additional parameters can be set and
>>> retrieved ("key/value" pairs). I still do not understand why one should
>>> resort to extract something from the pattern.
>>> [The pattern is unfortunately "public" whereas it should be an
>>> "implementation detail".]
>>
>> I don't "extract" something from the pattern, I just check if it is a
>> known and expected enumerate I want to handle specifically or something
>> else.
> 
> Maybe I should see the actual code before we go on discussing this. Of
> course you are free to do whatever the API of CM lets you do. I just have
> the feeling that I would do it otherwise... :-}

Here is an example I encountered two minutes ago, while working on
adding the exception throws statements in the ode package.

The computeParameterJacobian may throw a MathIllegalArgumentException
(in the current subversion repository, I have seen it throws an
IllegalArgumentException, which is wrong). In the following piece of
code, I need to check the exception I get is really a
LocalizedFormats.UNKNOWN_PARAMETER or something else:


  delayedExcpetion = null;
  for (ParameterJacobianProvider provider: jacobianProviders) {

    try {

      provider.computeParameterJacobian(...);
      return;

    } catch (MathIllegalArgumentException miae) {
      List<Localizable> patterns = miae.getContext().getPatterns();
      if (patterns.contains(LocalizedFormats.UNKNOWN_PARAMETER)) {
        // temporarily ignore, until we have checked all providers
        delayedException = miae;
      } else {
        // this is another problem, report it
        throw miae;
      }
    }

  }

  if (delayedException != null) {
    // none of the providers did handle the parameters
    throw miae;
  }

Here, I only use only the pattern in the check, sometimes I need to also
check the parameters (for example here I could use the name of the
parameter). Note that the exception thrown here is a high level
MathIllegalArgumentException, not a specialized exception like
NumberIsTooLarge which does have specialized getters like getMax.

So for this kind of use, I need getters in the context class
(getPatterns and getParameters). The alternative would be to create
specialized exceptions for every single LocalizedFormats we have, which
is clearly too much.

best regards,
Luc

> 
>>>
>>>>
>>>>>
>>>>>> It is not possible to catch ExceptionContextProvider because it
>>>>>> is not a throwable (Throwable is a class, not an interface, so we
>>>>>> inherit the Throwable nature from the top level class, not as
>>>>>> implementing the ExceptionContextProvider interface.
>>>>>
>>>>> This should be sowewhat alleviated in Java 7, since it is possible to
catch
>>>>> many exceptions in the same clause. Of course, it doesn't help if
>>>>> applications are stuck to Java 5... :-}
>>>>
>>>> You should be happy we do not support Java 1.3 anymore.
>>>
>>> Yes. Thank you! ;-)
>>> [I'm not going to start another inner thread...]
>>>
>>>>
>>>>>
>>>>>>
>>>>>> The second useful thing is for [math] development itself. With a
single
>>>>>> root, we can temporarily change its parent class from RuntimeException
>>>>>> to Exception, then fix all missing throws declaration and javadoc,
then
>>>>>> put the parent class back before committing. This would help having
more
>>>>>> up to date declarations. For now, I am sure we have missed a lot
of our
>>>>>> own exceptions and let them propagate upward without anybody knowing
it.
>>>>>> As a test, I have just changed the parent for
>>>>>> MathIllegalArgumentException to Exception. I got 1384 compilation
>>>>>> errors. Just going to the first one (a constructor of
>>>>>> BaseAbstractUnivariateIntegrator), I saw we did not advertise the
fact
>>>>>> it may throw NumberIsTooSmallException and NotStrictlyPositiveException,
>>>>>> neither in a throws declaration nor in the javadoc. I did not look
at
>>>>>> the 1383 other errors...
>>>>>
>>>>> I'm -1 to consider this as something to fix. Quite the opposite: in
>>>>> "standard" Java code, runtime exceptions must not appear in throws clauses
>>>>> (cf. mail archive for rationale and references).
>>>>> I understood that this switch to checked exceptions is part of your work
>>>>> cycle, but it has nothing to do with the library being well implemented.
>>>>>
>>>>> I do agree that the Javadoc is supposed to document all thrown exceptions.
>>>>
>>>> The process I explained simplifies this.
>>>
>>> My view is that the documentation being one important component of a
>>> contribution to CM, a patch should not be committed if it lacks in this
>>> respect. Yes, this is a rule that might potentially turn away contributors,
>>> but it's worth the time you or I would have to spend afterwards in order to
>>> clean up the mess.
>>> When we enforce good quality contributions, you don't have to resort to this
>>> "switch" trick (which cannot always work because checked and unchecked
>>> exceptions are not interchangeable, _conceptually_).
>>
>> The trick does is not intended to lead to working code, it is a dirty
>> hack around compiler checks for throws declarations and checkstyle
>> checks for javadoc matching throws declarations. Its goal is only to
>> identify missing parts. Once errors are resolved, of course we go back
>> to unchecked root and then we get code that should work. The
>> intermedaite code should only compile, it is not intended to be run.
> 
> Thanks for repeating the detailed explanation; I had understood that. My
> point was that my preferred style is to _not_ have "throws" clauses in the
> method signature. When following that style it is impossible to do your
> above check (that's why you have had so many compilation "errors").
> 
> But I'm ready to change my mind, for the sake of ensuring consistency and
> completeness of the CM documentation, at the cost of duplicating the
> information about the exceptions, but with the benefit that Luc's use-case
> is satisfied (i.e. an application developer can be 100% sure that all
> exceptions potentially raised by CM are indeed advertised as such).
> 
> Let's add this _rule_ to our "must read" document for old and new
> contributors! ;-)
> 
>>>
>>>>
>>>>>
>>>>> And I certainly do agree that,
>>>>>  1. if applications must catch all exceptions and
>>>>>  2. they must display them differently according to the component that
threw
>>>>>     them,
>>>>> then having a singly rooted hierarchy is nicer to the application developer.
>>>>>
>>>>> [If only the first condition holds, I don't see what's the problem with
>>>>> putting the whole code inside a "try" block and catch "Exception".]
>>>>
>>>> Intermediate level applications are not allowed to catch Exception. They
>>>> too use checkstyle and findbugs, and catching exception is bad for them too.
>>>
>>> OK, that's a fair argument. I just wanted to play the devil's advocate, for
>>> multiple hierarchies!
>>>
>>>>
>>>>>
>>>>> As during the dicussions that followed my proposal to get rid of checked
>>>>> exceptions, there seems to still be a confusion with the purpose of either
>>>>> category (checked vs unchecked): a runtime exception is _not_ some kind
of
>>>>> control structure, like "I cannot perform the operation; please try again
>>>>> later"; rather it is informing the caller of a permanent failure: "I
will
>>>>> never be able to perform the operation; please don't try again".
>>>>> [Checked exceptions were meant for the former: e.g. when trying to write
to
>>>>> a file, you get a "Disk is full" error and the caller can act (e.g. deleting
>>>>> some files) so that a subsequent call can succeed. I contend that CM
does
>>>>> not have any such procedures, nor should be trying to handle "retry"
>>>>> scenarios.]
>>>>
>>>> It's not as simple as this. When we detect a problem at [math] level, we
>>>> cannot say the problem is something that cannot be fixed, we don't even
>>>> know what the user tried to do.
>>>>
>>>> One example I often use is launch window computation for rockets. One
>>>> way to compute them is brute force simulation: you simply try all launch
>>>> times and for each one you perform the simulation of the early orbits
>>>> phase. If at any time something breaks (a polynomial that never crosses
>>>> zero, an optimizer that fails to converge, a square root from a negative
>>>> number ...), you simply know you have exceeded the launch window
>>>> boundary and you try another date to identify the limit. The
>>>> computations are far too complex to be able to say before hand using
>>>> preconditions: this launch time will not work (it is an example of the
>>>> Turing halting problem). Also the fact a computation totally fails in
>>>> this context is not an indication something is wrong in what the user
>>>> asked for, the user was especially looking for such failures! This
>>>> example is rather specific, I admit it, but the point is that at low
>>>> level, you cannot decide this is a problem that should stop the
>>>> application so I can trigger a runtime exception and this other case is
>>>> a problem the user can handle and I should trigger a regular exception.
>>>
>>> Thanks for this illustrative example.
>>> But it does not contradict the spirit of what I wrote. It's "clear" that
>>> making sure that such a complex task will not fail is at least as complex as
>>> the task itself. However, the caller knows that, and, as you explained, even
>>> expects that failures will happen. Hence, it wraps whatever calls are
>>> expected to fail within a "try"-block, catch the exception and decides what
>>> to do: abort the application, or go to the next set of parameters.
>>
>> Yes, this is what is done.
>>
>>> What I
>>> meant is that the policy is the caller's sole responsibility; there is
>>> nothing CM can do beyond saying "that call did not succeed". In the language
>>> of your example: There is no way that CM would be able to say "try another
>>> launch time". Thus even if from the user's viewpoint, there is nothing wrong
>>> with getting an exception, from the CM developer's viewpoint, the code was
>>> nevertheless called inappropriately.
>>
>> Yes.
>>
>>>
>>>>>
>>>>> To be hopefully clear (referring to your remark about
>>>>> "MathIllegalArgumentException"): Wrong arguments are the cause of the
>>>>> failure and it is the role of the _caller_ to prevent this from happening.
>>
>> Yes, but the point is that he can expect only exception that have been
>> documented to be thrown. He cannot guess by himself some
>> NumberIsTooSmallException is hidden deeply in the [math] code if the
>> public method he calls does not advertise it. So he may miss this
>> exception in his try/catch block.
> 
> That's clear.
> 
> I still wonder how an application level call, that has to traverse many
> layers until it reaches a method that happens to throw an unadvertised
> exception, can figure out distinctive handling for this vs another
> exception; it seems to me that it should catch the base exceptions (or the
> single base exception in a singly rooted hierarchy).
> [But this is not going to change the principle to which I'm rallying now...]
> 
> 
> Regards,
> Gilles
> 
>>
>> best regards,
>> Luc
>>
>>>>
>>>> It's not always possible and it was proved by Turing in 1936. It is an
>>>> undecidable problem.
>>>
>>> I think that this is a slight overstatement. That there is no algorithm that
>>> can decide whether an _arbitrary_ program will halt or not on an _arbitrary_
>>> input does not imply that it's impossible to predict what a _specific_
>>> program will do on a _specific_ input. I'm sure that you're a very good
>>> programmer and that your application will definitely decide whether the
>>> launch time is within the limits or not. :-)
>>>
>>>
>>> Best,
>>> Gilles
>>>
>>>> best regards,
>>>> Luc
>>>>
>>>>> The fact that CM checks for the validity of the preconditions is for
>>>>> robustness: It is a last resort check in order to fail early, with a
>>>>> meaningful report.
>>>>> And indeed, when all goes fine (as it does in most cases), the CM checks
are
>>>>> _redundant_; but we (as application developers) are willing to pay this
>>>>> price for the added security, in those cases that are our own programming
>>>>> _bugs_.
>>>>>
>>>>>
>>>>> Best regards,
>>>>> Gilles
>>>>>
>>>>>>
>>>>>>> What I am missing is how knowing that an
>>>>>>> aspecific RTE came from within [math] makes a difference.  I
am
>>>>>>> skeptical about ever depending on that kind of conclusion because
>>>>>>> dependencies may bring [math] code in at multiple levels.  Also,
is
>>>>>>> there an implied assumption in your ideal setup that *no* exceptions
>>>>>>> propagate to [math] clients other than MRTE (i.e. we catch and
wrap
>>>>>>> everything)?
>>>>>>
>>>>>> No, I don't make this assumption. I consider that at upper levels,
code
>>>>>> can receive exception from all layers underneath ([math] at the very
>>>>>> bottom, but also other layers in between). With two or three layers,
you
>>>>>> can still handle a few library-wide exceptions (see my example with
>>>>>> MathRuntimeException, and MylibException above). However, if at one
>>>>>> level the development rules state that all exception must be caught
and
>>>>>> wrapped (this happens in some critical systems contexts), then a
single
>>>>>> root hierarchy helps a lot.
>>>>>>
>>>>>> My point is that with a single root, we can get the best of two worlds:
>>>>>> large scope catches and pinpointed catches. The choice remains open
for
>>>>>> users. With a multi-rooted hierarchy, we force users to duplicate
the
>>>>>> same work for all exceptions we may throw, and we also force them
to
>>>>>> recheck everything when we publish a new version, even despite we
>>>>>> ourselves fail to document these exceptions accurately.
>>>>>>
>>>>>> best regards,
>>>>>> Luc
>>>>>>
>>>>>>>
>>>>>>> Phil
>>>>>>>>
>>>>>>>> For sure, this is something that can be done only for a major
release.
>>>>>>>>
>>>>>>>>>>> Client apps cannot do more with checked exceptions,
and can be made as
>>>>>>>>>>> "safe" by wrapping calls in try-blocks.
>>>>>>>>>>> On the other hand, client source code is much
cleaner without unnecessary
>>>>>>>>>>> "throws" clauses or wrapping of checked expections
at all levels.
>>>>>>>>>>> Some Java experts go as far as saying that checked
exceptions were a
>>>>>>>>>>> language design mistake (never repeated in languages
invented more
>>>>>>>>>>> recently).
>>>>>>>>>>>
>>>>>>>>>>>> There is a reason that NaNs exist.  It is
much cheaper to return a
>>>>>>>>>>>> NaN than to raise (and force the client to
handle) an exception. 
>>>>>>>>>>>> This is not precise and probably can't be
made so, but I have always
>>>>>>>>>>>> looked at things more or less like this:
>>>>>>>>>>>>
>>>>>>>>>>>> 0) IAE (which I see no need to specialize
as elaborately as we have
>>>>>>>>>>>> done in [math]) is for clear violation of
the documented API
>>>>>>>>>>>> contract.  The actual parameters "don't make
sense" in the context
>>>>>>>>>>>> of the API.
>>>>>>>>>>> The "elaboration" is actually very basic (but
that's a matter of taste), but
>>>>>>>>>>> it was primarily promoted (by me) in order to
hide (as much as possible) the
>>>>>>>>>>> ugliness (another matter of taste) of the "LocalizedFormats"
enum, and its
>>>>>>>>>>> inconsequent use (duplication). [Cf. discussions
in the archive.]
>>>>>>>>>>>
>>>>>>>>>>>> 1) NaN can be returned as the result of a
computation which, when
>>>>>>>>>>>> started with legitimate arguments, does not
result in a
>>>>>>>>>>>> representable value.
>>>>>>>>>>> According to this description, Sébastien's case
_must_ be handled by an
>>>>>>>>>>> exception: the argument is _not_ legtimate.
>>>>>>>>>>> The usage of NaN I was referring to is to let
a computation proceed ("follow
>>>>>>>>>>> an unexceptional path") in the hope that the
final result might still be
>>>>>>>>>>> meaningful.
>>>>>>>>>>> If the NaN persists, not checking for it and
signalling the problem (i.e.
>>>>>>>>>>> raise an exception) is a bug. This is to avoid
that (and be robust) that we
>>>>>>>>>>> do extensive precondition checks in CM. But this
has the unavoidable
>>>>>>>>>>> drawback that the use of NaN as suggested is
much less likely to be feasible
>>>>>>>>>>> when calling CM code. Once realizing that, it
becomes much less obvious that
>>>>>>>>>>> there is _any_ advantage of letting NaNs propagate...
>>>>>>>>>>> [Anyone has an example of NaN usage? Please let
me know.]
>>>>>>>>>> I use NaN a lot as an indicator that a variable has
not been fully
>>>>>>>>>> initialized yet. This occurs for example in iterative
algorithms, where
>>>>>>>>>> some result is computed deep inside some loop and
we don't know when the
>>>>>>>>>> loop will end. Then I write something along these
lines:
>>>>>>>>>>
>>>>>>>>>>   while (Double.isNaN(result)) {
>>>>>>>>>>      // do something that hopefully will change result
to non-NaN
>>>>>>>>>>   }
>>>>>>>>>>
>>>>>>>>>>   // now I know result has been computed
>>>>>>>>>>
>>>>>>>>>> Another use is to initialize some fields in class
to values I know are
>>>>>>>>>> not meaningful. I can then us NaN as a marker to
do lazy evaluation for
>>>>>>>>>> values that takes time to compute and should be computed
only when both
>>>>>>>>>> really needed and when everything required for their
computation is
>>>>>>>>>> available.
>>>>>>>>> I should have said "[...] example of NaN usage, beyond
singling out
>>>>>>>>> unitialized data [...]". The above makes use of NaN as
"invalid" because it
>>>>>>>>> is not initialized (yet).
>>>>>>>> Yes.
>>>>>>>>
>>>>>>>>> I'd assume that if "result" stays NaN after the allowed
number of
>>>>>>>>> iterations, you raise an exception, i.e. you don't propagate
NaN as the
>>>>>>>>> output of a computation that cannot provide a useful
result. However, this
>>>>>>>>> (propagating NaN) is the behaviour of "srqt(-1)", for
example.
>>>>>>>>> Thus, if you raise an exception, your computation does
not behave in the
>>>>>>>>> same way as the function "sqrt".
>>>>>>>>>
>>>>>>>>>> Another use is simply to detect some special cases
in computations (like
>>>>>>>>>> sqrt(-1) or 0/0). I do the computation first and
check the NaN
>>>>>>>>>> afterwards. See for example the detection of NaNs
in the linear
>>>>>>>>>> combinations in MathArrays or in the nth order Brent
solver.
>>>>>>>>> OK, this is a good example, in line with the intended
usage of NaN (as it
>>>>>>>>> avoids inserting control structures in the computation).
>>>>>>>> Yes. One of the main use case for this is when a computation
involves a
>>>>>>>> loop and failure is very rare. So we avoid costly numerous
if statements
>>>>>>>> within the loop and do a single check. In the few cases this
single
>>>>>>>> check fails, we go to a diffrent branch to handle the failure.
This is
>>>>>>>> exactly what is done in linear combination.
>>>>>>>>
>>>>>>>>>> Another use of NaNs occurs when integrating various
code components from
>>>>>>>>>> different origins in a single application. Data is
forwarded between the
>>>>>>>>>> various components in all directions. Components
never share the same
>>>>>>>>>> exceptions mechanisms. Components either process
NaNs specially (which
>>>>>>>>>> is good) or they let the processor propagate them
(it is what the IEEE
>>>>>>>>>> standard mandates) and at the end you can detect
it reliably at
>>>>>>>>>> application level.
>>>>>>>>> I'm not sure I understand this. Is it good or bad that
a component lets NaNs
>>>>>>>>> propagate? Are there situations when it's good and others
where it's bad?
>>>>>>>> In the cases I encountered, it is always good to have NaNs
propagated. A
>>>>>>>> component that is not an application by itself but only a
part (low or
>>>>>>>> intermediate level) often cannot decide at its level how
to handle NaNs
>>>>>>>> except in rare cases. So it propagates them upward. The previous
example
>>>>>>>> (linear combination in [math]) is of course a counter-example:
we are at
>>>>>>>> low level, we know how to handle the NaN for this operation,
so we
>>>>>>>> detect it and fix it.
>>>>>>>>
>>>>>>>>> That's why I was asking (cf. quote from previous post
below) what are the
>>>>>>>>> criteria, so that contributors know how to write code
when the feature falls
>>>>>>>>> in one or the other category.
>>>>>>>>>
>>>>>>>>>>>> The problem is that contracts can often be
written so that instances
>>>>>>>>>>>> of 1) are turned into instances of 0).  Gamma(-)
is a great
>>>>>>>>>>>> example.  The singularities at negative integers
could be viewed as
>>>>>>>>>>>> making negative integer arguments "illegal"
or "nonsense" from the
>>>>>>>>>>>> API standpoint,
>>>>>>>>>>> They are just nonsense (not just from an API
standpoint).
>>>>>>>>>>>
>>>>>>>>>>>> or legitimate arguments for which no well-defined,
>>>>>>>>>>>> representable value can be returned.  Personally,
I would prefer to
>>>>>>>>>>>> get NaN back from this function and just
point out the existence of
>>>>>>>>>>>> the singularities in the javadoc.
>>>>>>>>>>> This is consistent with how basic math functions
behave, but not with the
>>>>>>>>>>> general rule/convention of most of CM code.
>>>>>>>>>>> It may be fine that we have several ways to deal
with exceptional
>>>>>>>>>>> conditions, but it might be nice, as with formatting,
to have rules so that
>>>>>>>>>>> we know how to write contributions.
>>>>>>>>>> Too many rules are not a solution, especially when
there are no tools to
>>>>>>>>>> help enforce these rules are obeyed. Relying only
on the fact human
>>>>>>>>>> proof-reading will enforce them is wishful thinking.
>>>>>>>>>>
>>>>>>>>> What is "too many"? ["How long should a person's legs
be?" ;-)]
>>>>>>>>> I don't agree with the "wishful thinking" statement;
a "diff" could probably
>>>>>>>>> show a lot a manual corrections to the code and comment
formatting. [Mainly
>>>>>>>>> in the sources which I touched at some point...]
>>>>>>>> I'm not sure I understand your point. Mine is that rules
that are not
>>>>>>>> backed by automated tools are a pain to enforce, and hence
are not
>>>>>>>> fulfilled most of the time, except at a tremendous human
resource cost.
>>>>>>>> In fact, even rules which can be associated with tools are
broken during
>>>>>>>> development for some time. We do not use
>>>>>>>> checkstyle/CLIRR/findbugs/PMD/RAT for all commits for example,
but do a
>>>>>>>> fix pass from time to time.
>>>>>>>>
>>>>>>>>> There are other areas where there is only human control,
namely the "svn
>>>>>>>>> log" messages where (no less picky) rules are enforced
just because it
>>>>>>>>> helps _humans_ in their change overview task.
>>>>>>>>>
>>>>>>>>> As pointed out by Jared, it's not a big problem to comply
with rules once
>>>>>>>>> you know them.
>>>>>>>> I fully agree with that, but I also think Phil is right when
he says too
>>>>>>>> many rules may discourage potential contributors. I remember
a link he
>>>>>>>> sent to us months ago about to a presentation by Michael
Meeks about
>>>>>>>> interacting with new developers
>>>>>>>> <http://people.gnome.org/~michael/data/2011-10-13-new-developers.pdf>.
>>>>>>>> Slides numers 3 an 4 are a fabulous example. I think we are
lucky Jared
>>>>>>>> has this state of mind and accepts picky rules easily. I'm
not sure such
>>>>>>>> an open mind is widespread among potential contributors.
>>>>>>>>
>>>>>>>>> Keeping source code tidy is quite helpful, and potential
contributors will
>>>>>>>>> be happy that they can read any CM source files and immediately
recognize
>>>>>>>>> that they are part of the same library...
>>>>>>>> Yes, of course. But the entry barrier should not be too high.
>>>>>>>>
>>>>>>>> best regards,
>>>>>>>> Luc
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>> Gilles
>>>>>>>>>
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
> 
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Mime
View raw message