commons-dev mailing list archives

From Gilles <gil...@harfang.homelinux.org>
Subject Re: [Math] Fluent API, inheritance and immutability
Date Thu, 08 Aug 2013 15:14:27 GMT
Hello.

>
> Sorry for the confusion my code sample caused; it made sense in my
> mind. :) I was speaking from the perspective that an abstract class is
> a skeletal implementation of an interface, created so that implementing
> the interface is easier. For the non-linear least squares (NLLS) package
> I was thinking something like:
> https://gist.github.com/anonymous/6184665

Thanks for the effort of writing the example in more detail.

> To address a few concerns I saw,
>
> Gilles wrote:
>> In this way, it does not; what I inferred from the partial code above
>> was that there would only be _one_ "create" method. But:
>> 1. All "create" methods must be overridden at the next level of the
>> class hierarchy (this is the duplication I was referring to in the
>> first post).
>
> True, but concrete classes should not be extended.

Why not?
[Anyway, the duplication also occurs in the intermediate (abstract)
levels.]

> Adding another
> abstract class to the hierarchy would mean implementing the create
> method from the superclass and delegating to a new abstract create
> method that includes the new parameters. The abstract class hierarchy
> would have to parallel the interface hierarchy.

With a mutable instance, you avoid this; that's the point.
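
To make the contrast concrete, here is a minimal sketch of the mutable
variant. All names (MutableOptimizer, DampedOptimizer, withDamping) are
hypothetical and not part of CM, and the self-type generic parameter is
only one possible way to keep chained calls typed. Each "withXxx" method
sets a field and returns the same instance, so a subclass that adds a
parameter only adds its own method; no "create" override is needed at
any level.

```java
// Hypothetical classes for illustration only; not the Commons Math API.
abstract class MutableOptimizer<T extends MutableOptimizer<T>> {
    private int maxIterations = 100;

    /** Returns "this" with the concrete type, so chained calls stay typed. */
    protected abstract T self();

    /** Mutates this instance and returns it (no new object is created). */
    public T withMaxIterations(int n) {
        this.maxIterations = n;
        return self();
    }

    public int getMaxIterations() {
        return maxIterations;
    }
}

// Adding a parameter at this level does not touch the superclass,
// and nothing from the superclass has to be overridden except self().
final class DampedOptimizer extends MutableOptimizer<DampedOptimizer> {
    private double damping = 1.0;

    @Override
    protected DampedOptimizer self() {
        return this;
    }

    public DampedOptimizer withDamping(double d) {
        this.damping = d;
        return this;
    }

    public double getDamping() {
        return damping;
    }
}

class MutableFluentDemo {
    public static void main(String[] args) {
        DampedOptimizer optim = new DampedOptimizer()
            .withMaxIterations(50)
            .withDamping(0.5);
        System.out.println(optim.getMaxIterations() + " " + optim.getDamping());
    }
}
```

The price is the one discussed in this thread: the instance is mutable,
so it must not be shared between threads without extra care.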

>
>> 2. When a parameter is added somewhere in the hierarchy, all the
>> classes below must be touched too.
>
> True, but since abstract classes are just skeletal implementations of
> interfaces you can't add any methods without breaking compatibility
> anyway.

Adding a (concrete) method in an abstract class will not break
compatibility.

> (You would have to add the same method to the public interface
> too.)

There you break compatibility; that's why we removed a lot of the
interfaces in CM, because the API is not stable, and abstract classes
allow non-breaking changes.
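
A small sketch of that difference, using hypothetical names
(AbstractChecker, UserChecker, getMaxEvaluations) that are not CM
classes. The subclass stands for user code written against an earlier
release; it keeps compiling after a concrete method is added to the
abstract class. Had the same method been added as an abstract method of
an interface (the only kind available in Java at the time of this
thread), every existing implementation would have had to be changed.

```java
// Hypothetical classes for illustration only; not the Commons Math API.
abstract class AbstractChecker {
    public abstract boolean converged(double previous, double current);

    // Assume this method was added in a later release. Because it is
    // concrete, subclasses written before it existed are unaffected.
    public int getMaxEvaluations() {
        return Integer.MAX_VALUE;
    }
}

// "User" code written against the first release: it does not know
// about getMaxEvaluations() and does not need to.
class UserChecker extends AbstractChecker {
    @Override
    public boolean converged(double previous, double current) {
        return Math.abs(current - previous) < 1e-6;
    }
}

class CompatibilityDemo {
    public static void main(String[] args) {
        AbstractChecker checker = new UserChecker();
        System.out.println(checker.converged(1.0, 1.0000001));
        System.out.println(checker.getMaxEvaluations());
    }
}
```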

> This does make it important to decide on a well written and
> complete API before releasing it.

When the scope of the software is well circumscribed, that would be
possible. With the whole of [Math]ematics, much less so. :-}
And state-of-the-art in Java is a moving target, aimed at by changing
CM contributors with differing needs and tastes; this adds to the
unstable mix.

>> And, we must note, that the duplication still does not ensure "real"
>> immutability unless all the passed arguments are themselves 
>> immutable.
>> But:
>> 1. We do not have control over them; e.g. in the case of the
>> optimizers the "ConvergenceChecker" interface could possibly be
>> implemented by a non-thread-safe class (I gave one example of such a
>> thing a few weeks ago when a user wanted to track the optimizer's
>> search path)
>
> True. Thread safety is a tricky beast. I think we agree that the only
> way to guarantee thread safety is to only depend on final concrete
> classes that are thread safe themselves.

I don't think so. When objects are immutable, thread-safety follows (but
note that the current optimizers in CM were never thread-safe).
But thread-safety can exist even with mutable objects; it's just that
more care must be taken to ensure it.
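
As an illustration of a mutable but thread-safe object, here is a
minimal sketch of a search-path recorder. The names (PathRecorder,
recordFromThreads) are hypothetical, and plain "synchronized" is just
the simplest of several possible techniques. The "extra care" consists
in guarding every access to the mutable list with the same lock and in
copying the caller's array so that no reference is shared.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class for illustration only; not the Commons Math API.
// Mutable, yet safe to share: every access to the list holds the same lock.
class PathRecorder {
    private final List<double[]> path = new ArrayList<>();

    public synchronized void record(double[] point) {
        path.add(point.clone()); // defensive copy of the caller's array
    }

    public synchronized int size() {
        return path.size();
    }
}

class PathRecorderDemo {
    /** Lets several threads record into one shared recorder and returns the final count. */
    static int recordFromThreads(int threadCount, int pointsPerThread) {
        PathRecorder recorder = new PathRecorder();
        List<Thread> threads = new ArrayList<>();
        for (int t = 0; t < threadCount; t++) {
            Thread thread = new Thread(() -> {
                for (int i = 0; i < pointsPerThread; i++) {
                    recorder.record(new double[] {i});
                }
            });
            threads.add(thread);
            thread.start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return recorder.size();
    }

    public static void main(String[] args) {
        System.out.println(recordFromThreads(4, 1000));
    }
}
```

Without the "synchronized" keywords, concurrent calls to "record" could
corrupt the underlying ArrayList or lose entries.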

> This is directly at odds with
> the inversion of control/dependency injection paradigm. I think a
> reasonable compromise is to depend on interfaces and make sure all the
> provided implementations are thread safe.

Yes, that's a way, but again easier said than done.

> A simple sequential user won't
> need to care about thread safety. A concurrent user will need to
> understand the implications of Java threading to begin with. Accurate
> documentation of which interfaces and methods are assumed to be thread
> safe goes a long way here.

I don't think I'm wrong if I say that most concurrency bugs are found in
production rather than in contrived test cases.
[That's why I advocated for introducing "real" applications as use-cases
for CM.]

>> 2. Some users _complained_ (among other things :-) that we should not
>> force immutability of some input (model function and Jacobian IIRC)
>> because in some use-cases, it would be unnecessarily costly.
>
> I agree that copying any large matrices or arrays is prohibitively
> expensive. For the NLLS package we would be copying a pointer to a
> function that can generate a large matrix. I think adding some
> documentation that functions should be thread safe if you want to use
> them from multiple threads would be sufficient.

If you pass a "pointer" (i.e. a "reference" in Java), all bets are off:
the class is not inherently thread-safe. That's why I suggested mandating
a _deep_ "copy" method (with a stringent contract that should allow a
caller to be sure that all objects owned by an instance are disconnected
from any other objects).
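
A minimal sketch of what such a deep "copy" contract could look like.
The classes (Weights, SimpleOptimizer) are hypothetical stand-ins, not
CM code. The constructor stores the reference it is given, so the
original instance stays connected to the caller's object; "copy()"
recursively copies every owned mutable object, so the copy is not.

```java
// Hypothetical classes for illustration only; not the Commons Math API.
class Weights {
    private final double[] values;

    Weights(double[] values) {
        this.values = values.clone();
    }

    /** Deep copy: the new instance owns its own array. */
    Weights copy() {
        return new Weights(values);
    }

    void set(int i, double v) {
        values[i] = v;
    }

    double get(int i) {
        return values[i];
    }
}

class SimpleOptimizer {
    private final Weights weights;
    private final int maxIterations;

    /** Stores the reference as passed: the caller can still mutate "weights". */
    SimpleOptimizer(Weights weights, int maxIterations) {
        this.weights = weights;
        this.maxIterations = maxIterations;
    }

    /** Contract: the returned instance shares no mutable object with this one. */
    SimpleOptimizer copy() {
        return new SimpleOptimizer(weights.copy(), maxIterations);
    }

    Weights getWeights() {
        return weights;
    }
}

class DeepCopyDemo {
    public static void main(String[] args) {
        Weights shared = new Weights(new double[] {1.0, 2.0});
        SimpleOptimizer original = new SimpleOptimizer(shared, 10);
        SimpleOptimizer mine = original.copy();

        shared.set(0, 42.0); // mutation through the outside reference

        System.out.println(original.getWeights().get(0)); // sees the change
        System.out.println(mine.getWeights().get(0));     // unaffected
    }
}
```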

>
>> Consequently, if (when creating a new instance) we assign a reference
>> passed to the fluent method, we cannot warrant thread-safety anymore;
>> which in turn poses the question of whether this false sense of
>> security warrants the increased code duplication and complexity
>> (compare your code below with the same but without the constructors
>> and "create" methods: even a toy example is already cut by more than
>> half).
>
> Agreed that thread safety can only be guaranteed with the help of the
> user. The immutable+fluent combination does add an additional layer of
> indirection. On the other hand it is much simpler to debug and analyze,
> especially in a concurrent environment.

Immutable objects can be shared between threads.
Thread-safety is equally obtained if mutable objects are not shared
between threads, e.g. keep the optimizer confined in one thread (no
implementation in CM features concurrency anyway).
Even if the optimizers are not immutable, it will be possible to benefit
from the efficiency improvements brought by concurrency: e.g. by ensuring
that the objective function is thread-safe, several evaluations could be
performed in parallel. [Moreover the optimization logic is not really
concurrent in most optimizers. So why have unnecessarily complicated
code?]
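
A minimal sketch of that idea, under the assumption that the objective
function is thread-safe. The names (Objective, ParallelEvaluator,
evaluateAll) are hypothetical and not CM code. Only the function
evaluations run in a thread pool; the caller, standing in for the
optimizer logic, stays in a single thread and never shares its own
state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical interface: implementations are assumed to be thread-safe.
interface Objective {
    double value(double x);
}

class ParallelEvaluator {
    /** Evaluates "f" at all points in parallel; results keep the order of "points". */
    static double[] evaluateAll(Objective f, double[] points, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Double>> tasks = new ArrayList<>();
            for (double x : points) {
                tasks.add(() -> f.value(x));
            }
            // invokeAll blocks until all tasks are done and returns the
            // futures in the same order as the submitted tasks.
            List<Future<Double>> futures = pool.invokeAll(tasks);
            double[] values = new double[points.length];
            for (int i = 0; i < values.length; i++) {
                values[i] = futures.get(i).get();
            }
            return values;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        double[] points = {1, 2, 3, 4};
        double[] values = evaluateAll(x -> (x - 3) * (x - 3), points, 2);

        // Sequential "optimizer" step, confined to the calling thread.
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] < values[best]) {
                best = i;
            }
        }
        System.out.println("best x = " + points[best]);
    }
}
```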

>
>> Then if we really want to aim at thread-safety, I think that the
>> approach of mandating a "copy()" interface to all participating
>> classes, would be a serious contender to just making all fields
>> "final". Let's also recall that immutability is not a goal but a means
>> (the goal being thread-safety).
>>
>> I still think that the three "tools" mentioned in the subject line do
>> not play along very well, unfortunately.
>> I was, and still am, a proponent of immutability but IMO this
>> discussion indicates that it should not be enforced at all cost. In
>> particular, in small and non-layered objects, it is easy to ensure
>> thread-safety (through "final") but the objects that most benefit from
>> a fluent API are big (with many configuration combinations), and their
>> primary usage does not necessarily benefit from transparent
>> instantiation.
>>
>> Given all these considerations, in the first steps for moving some CM
>> codes to fluent APIs, I would aim for simplicity and _less_ code (if
>> just to be able to make adjustments more quickly).
>> Then, from "simple" code, we can more easily experiment (and compare,
>> with "diff") the merits and drawbacks of various routes towards
>> thread-safety.
>>
>> OK?
>
> Agreed on the goal of thread safety. Is the copy a shallow copy?

No! (cf. above.)

> If it
> is, then copy is a complete punt of thread-safety to the user, forcing
> them to determine when and what they need to copy. I think the
> mutable+copy option would make it harder to understand client code and
> understand where copying is necessary.

No; my suggestion, if feasible, is that a user who needs his own,
personal, unshared instance would simply call "copy()":

public void myMethodOne(Optimizer optim) {
   // I don't trust that "optim" can be safe: I make a (deep) copy.
   final Optimizer myOptim = optim.copy();

   // By contract, "myOptim" is assumed to contain unshared fields.
   // I can pass it safely to another thread.
   myMethodTwo(myOptim);
}

private void myMethodTwo(Optimizer optim) {
   // Create a new thread, whatever...
}

But my current point is that this could be developed at a later point,
after the fluent API has been adopted (and more widely tried within CM),
when someone has shown that e.g. the optimizer must be thread-safe to
achieve some actual use-case.

> Ultimately the decision is up to
> the maintainers and I think both options under discussion are a big
> improvement over the current API. :)
>
> Thanks for the great library.

Thanks for the discussion,
Gilles


