commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Sébastien Brisard <sebastien.bris...@m4x.org>
Subject Re: [math] [linear] immutability
Date Tue, 01 Jan 2013 19:39:11 GMT
Hi Konstantin,


2013/1/1 Konstantin Berlin <kberlin@gmail.com>

> Here are my 2 cents:
>
> The rationale for redesign of the RealMatrix hierarchy is because a large
> amount of numerical packages depend (or should depend) on a linear algebra
> package for efficient implementation. However, as it currently stands, it
> is not possible (or unnecessarily burdensome) to create a user defined
> class that would have only the necessary and sufficient properties to
> perform such a task.
>
> That is actually a good point. I agree that CM should make it easy to
interface with well established native libraries, even if default
implementations provided in the core CM library should be pure java, as
stated in the "Guiding principles". There are many places where this should
be kept in mind, besides linear algebra. FFT is another example (I've been
writing a JNI wrapper to FFTW, but it does not blends well in CM).


> In my view immutability cannot be a property of the base matrix class, for
> the same reason as Collection classes cannot be immutable. Since a large
> amount of matrix algorithms are iterative in nature, this would make vast
> majority of algorithms intractable computationally. Just look at basic
> Gaussian elimination. Assuming you provide a vector operation for the
> inner-most loop in the API, you will still force a copy during each
> iteration of the two outermost loops. In total, this would force O(N^4)
> copies, while the actual algorithm is O(N^3). You can add Gaussian
> elimination, which is a bad idea, because you will never be able to
> implement ever algorithm in existence. In addition you have no forced your
> Gaussian elimination class to assume the most general case, and hence you
> force an O(N^4) implementation of the algorithm, since you cannot assume
> that the input matrix class has in place operations.
>
> Immutability is a very nice property, but only works if your class is
> described by a small number of member classes. This is not the case here.
> If immutability is desired for some applications, a defensive copy can be
> made. I completely disagree with your statement about how multi-threaded
> code. Please provide an example. As a note, the Java Collections package is
> a good example  of a similar problem.
>
> My understanding is that we try to enforce immutability in CM as much as
possible, when it does not damage the performances (memory- or speed-wise).
As you point out, immutability is probably not a good idea (!) in linear
algebra packages. I think I was the one who raised this issue of immutable
matrices in a previous thread. What I had in mind was sparse matrices.
Indeed, besides the bugs we have discovered in the last twelve months
(which led to their deprecation), Gilles mentioned the poor performances of
our implementations. I thought that maybe it was due to the underlying
(dynamic) data structure for sparse vectors/matrices
(OpenIntToDoubleHashMap), while typical storage schemes (compressed row,
skyline) might be more efficient (since only arrays are used), but do not
lend themselves to changes of the value of initially null coefficients.
That's why I was suggesting immutable matrices as a start, but what I
really meant was matrices where the null coefficients are constant.

Please note that your objection does not hold with iterative linear solvers
(conjugate gradient, ...), so immutable matrices might still be interesting.

I see issue MATH-765, however I don't think it's possible to do anything
> until there is a specific proposal for an interface, where we can add
> comments.
>
> I didn't understand that we were going for a full redisign of the linear
package. One feature I would really love to implement is views. Also,
please note that Ted mentioned a very important point: the need for
standardized tests. I've done that for RealVectors (all implementations are
unit tested using the same abstract test suite). It should also be done
with matrices... but it's a long, tedious refactoring work!!!


> P.S. I would also be interested if people think Vector should be made into
> a special case of a Matrix.


That's an interesting idea.


> Also I think the RealMatrix interface should not be the starting`
> interface, but rather a midpoint in the hierarchy. A merging point, where
> all the interfaces needed for the optimization package, and other packages,
> come together to form something more general. While a lot of functionality
> in the matrix class, like visitors etc, are needed for implementing norms
> and other things, they are not needed by the iterative optimizers.
>

I agree. It's also true of iterative linear solvers (which should not be
discarded from this thread!!!), which only require a RealLinearOperator.

Best regards,
Sébastien


>
> On Dec 31, 2012, at 8:07 PM, Gilles Sadowski <gilles@harfang.homelinux.org>
> wrote:
>
> > Hi.
> >
> >>> If we stick to
> >>>
> >>> 0) algebraic objects are immutable
> >>> 1) algorithms defined using algebraic concepts should be implemented
> >>> using algebraic objects
> >>>
> >>> ...
> >>> 0)  Start, with Konstantin's help, by fleshing out the InPlace
> >>> matrix / vector interface
> >>> 1)  Integrate Mahout code as part of a wholesale refactoring of the
> >>> linear package
> >
> > What do you mean by this?
> > Copy/paste or create a dependency? Something else?
> >
> >>> 2)  Extend use of the visitor pattern to perform mutations
> >>> "in-place" (similar to 0) in effect)
> >>>
> >
> > As suggested in a previous post:
> >
> > 3) a) Define a new "minimal matrix" interface, and create immutable
> >      implementations.
> >   b) Benchmark critical methods (entry access, iteration, add, multiply,
> >      etc.)
> >   c) Quantify the efficiency gain of in-place operations and only when
> this
> >      information is available decide whether the gain is worth the price.
> >      [Even if in-place operations are faster in a single thread context,
> it
> >      is not sure that immutability would not change that in a
> multi-thread
> >      implementation. Trying to outperform multi-threaded code with
> in-place
> >      operations is a dead end.]
> >
> > Before embarking on any of this, please identify the rationale: Is there
> > _one_ identified problem that would require urgent action? This
> discussion
> > about clean-up/improvement/simplification of the CM matrix
> implementations
> > has been going on for months, and we should not start a "new" discussion
> > without referring to what has been recorded by Sébastien on JIRA.
> >
> >
> > Regards,
> > Gilles
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> > For additional commands, e-mail: dev-help@commons.apache.org
> >
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message